Abstract. Consider d uniformly random permutation matrices on n labels. Consider the sum of these matrices along with their transposes. The total can be interpreted as the adjacency matrix of a random regular graph of degree 2d on n vertices. We consider limit theorems for various combinatorial and analytical properties of this graph (or the matrix) as n grows to infinity, either when d is kept fixed or grows slowly with n. In a suitable weak convergence framework, we prove that the (finite but growing in length) sequences of the number of short cycles and of cyclically non-backtracking walks converge to distributional limits. We estimate the total variation distance from the limit using Stein's method. As an application of these results we derive limits of linear functionals of the eigenvalues of the adjacency matrix. A key step in this latter derivation is an extension of the Kahn-Szemerédi argument for estimating the second largest eigenvalue for all values of d and n.
1. Introduction



We consider several asymptotic enumeration and analytic problems for sparse random regular 
graphs and their adjacency matrices. A graph is called regular if every vertex has the same degree; a 
sparse regular graph is typically one for which the degree d is either constant or of a far smaller order 
than the number of vertices n. A classical model is the uniform distribution over all d-regular graphs 
on n labeled vertices; a thorough survey on properties of the uniform model can be found in [Wor99].

Our model of choice is the more recent permutation model: Consider d many iid uniformly random permutations $\{\pi_1, \ldots, \pi_d\}$ on n vertices labeled $\{1, 2, \ldots, n\}$. A graph can be constructed by adding one edge between each pair $(i, \pi_j(i))$; thus every vertex $i$ has edges to $\pi_j(i)$ and $\pi_j^{-1}(i)$ for every permutation $\pi_j$, for a total degree of 2d. As the reader will note, this allows multiple edges and self-loops, with each self-loop contributing two to the degree of its vertex. However, one can still ask the usual enumeration questions about this graph, e.g., the distribution of the number of cycles.



Another way to represent this graph is by its adjacency matrix, which is an $n \times n$ matrix whose $(i,j)$th entry is the number of edges between i and j, with self-loops counted twice. This random matrix can now be studied in its own right; for example, one can ask about the distribution of its eigenvalues. Note that, trivially, the top eigenvalue is 2d; the distribution of the rest of the eigenvalues is an interesting question. For the uniform model of random regular graphs (or Erdős-Rényi graphs) such questions have been studied since the pioneering work [McK81]. Among the more recent articles, see [FO05], [TVW10], and [DP10]. We refer the reader to [DP10] for a more exhaustive review of the vast related literature.
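As an illustration of the construction just described, the following is a minimal sketch (not code from the paper) that builds the adjacency matrix as a sum of d permutation matrices and their transposes. The function name `permutation_model_adjacency` and the 0-based vertex labels are our own assumptions. Every row sums to 2d, so the all-ones vector is an eigenvector with eigenvalue 2d.

```python
import random

def permutation_model_adjacency(n, d, rng):
    """Sum of d uniformly random permutation matrices and their transposes.

    Vertices are labeled 0..n-1 here (the text uses 1..n).
    """
    A = [[0] * n for _ in range(n)]
    for _ in range(d):
        perm = list(range(n))
        rng.shuffle(perm)            # uniform random permutation
        for i, j in enumerate(perm):
            A[i][j] += 1             # entry from the permutation matrix
            A[j][i] += 1             # entry from its transpose
    return A

rng = random.Random(0)
n, d = 8, 3
A = permutation_model_adjacency(n, d, rng)
row_sums = [sum(row) for row in A]   # each vertex has degree 2d
is_symmetric = all(A[i][j] == A[j][i] for i in range(n) for j in range(n))
```

A fixed point of a permutation adds 2 to the corresponding diagonal entry, which matches the convention that self-loops are counted twice.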

Our results touch on both aspects. We consider two separate scenarios, either when d is independent of n, or when d grows slowly with n. We will assume throughout that $d \ge 2$; the reason for this is that the d = 1 case has been dealt with (in a larger context) by [BAD11].

The paper is divided into three thematically separate but mathematically dependent parts. 

(i) Section 3. Joint asymptotic distribution of a growing sequence of short cycles. It is well known in the classical models of random regular graphs that the number of cycles of length k, where k is small (typically logarithmic in n), is approximately Poisson. See [Bol01] or [Wor99] for an account of older results, or [MWW04] for the best result in this direction. In Theorem [TU we prove this fact for the permutation model, using Stein's method along with ideas from [LP10] to estimate the total variation distance between a vector of the number of cycles of lengths 1 to r and a corresponding vector of independent Poisson random variables. This theorem holds for nearly the same regime of r, d, and n as in [MWW04, Theorem 1], and unlike that theorem gives an explicit error bound on the approximation. This bound is essential to our analysis of eigenvalue statistics in Section 5.

The mean number of cycles is somewhat interesting. When d is fixed, for the uniform model of random 2d-regular graphs, the limiting mean of the number of short cycles of length k is $(2d-1)^k/2k$. For the permutation model, the limiting mean is the slightly different quantity $a(d,k)/2k$, where

$$a(d,k) = \begin{cases} (2d-1)^k - 1 + 2d, & \text{when } k \text{ is even}, \\ (2d-1)^k + 1, & \text{when } k \text{ is odd}. \end{cases}$$

See also [LMMW09, Theorem 4.1], in which the authors consider a different model of random regular graph and find that the limiting mean number of cycles of length k differs slightly from both of these.

Next we consider the number of short non-backtracking walks on the graph; a non-backtracking walk is a closed walk that never follows an edge and immediately retraces that same edge backwards. We actually consider cyclically non-backtracking walks (CNBWs), whose definition will be given in Subsection 3.2. Non-backtracking walks are important in both theory and practice, as can be seen from the articles [Fri08] and [ABLS07]. We consider the entire vector of cyclically non-backtracking walks of lengths 1 to $r_n$, where $r_n$ is the "boundary length" of short walks/cycles, and is growing to infinity with n. In Theorem [3TJ we assume that d is independent of n. We prove that the vector of CNBWs, as a random sequence in a weighted $\ell^2$ space, converges weakly to a limiting random sequence whose finite-dimensional distributions are linear sums of independent Poisson random variables.

When d grows slowly with n (slower than any fixed power of n, which is the same regime studied in [DP10]), a corresponding result is proved in Theorem 22. Here, we center the vector of CNBWs for each n. The resulting random sequence converges weakly to an infinite sequence of independent, centered normal random variables with unequal variances ($\sigma_k^2 = 2k$).



(ii) Section 4. An estimate of $C\sqrt{2d-1}$ for the second largest (in absolute value) eigenvalue for any (d,n). The spectral gap of the permutation model, for fixed d, has been intensely studied recently in [Fri08] for the resolution of the Alon conjecture. This conjecture states that the second largest eigenvalue of 'most random regular graphs' of degree 2d is less than $2\sqrt{2d-1} + \epsilon$; the assumption is that d is kept fixed while n grows to infinity. This important conjecture implies that 'most' sparse random regular graphs are nearly Ramanujan (see [LPS88]). Friedman's work builds on earlier work [FK81], [BS87], and [Fri91]. Although [Fri08] and related works consider the permutation model, for fixed d, their results also apply to other models due to various contiguity results; see [Wor99, Section 4] and [GJKW02].

To develop the precise second eigenvalue control that we require in Section 5, we have followed a line of reasoning that originates with Kahn and Szemerédi [FKS89]. This approach has been used recently to great effect by [BFSU99], [FO05], and [LSV11], to name a few. With this technique we are able to show that the second largest eigenvalue is bounded by $40000\sqrt{2d-1}$ with probability at least $1 - Cn^{-1}$ for some universal constant C (see Theorem 24). We have not attempted to find an optimal constant, and instead we focus on extricating the d and n dependence in the bound.

Both [BFSU99] and [LSV11] provide examples of how the Kahn-Szemerédi argument can be used to control the second eigenvalue when d grows with n. In [BFSU99], the authors work in the configuration model to obtain the $O(\sqrt{d})$ bound for $d = O(\sqrt{n})$, essentially the largest d for which the configuration model represents the uniform d-regular graph well enough to prove eigenvalue concentration. In [LSV11], the authors study the spectra of random covers. The permutation model is an example of such a cover, where the base graph is a single point with d self-loops. Using the Kahn-Szemerédi machinery, they are able to show an $O(\sqrt{d}\log d)$ bound with $d(n) = \mathrm{poly}(n)$ growth. The adaptations to the original Kahn-Szemerédi argument made in [LSV11], especially the usage of Freedman's martingale inequality, are similar to the ones made here. However, as we do not need to consider the geometry of the base graph, we are able to push this argument to prove a non-asymptotic bound of the correct order.




(iii) Section 5. Limiting distribution of linear eigenvalue statistics of the rescaled adjacency matrix. Our final section is in the spirit of Random Matrix Theory (RMT). Let $A_n$ denote the adjacency matrix of a random regular graph on n vertices. By linear statistics of the spectrum we mean random variables of the type $\sum_{i=1}^n f(\lambda_i)$, where $\lambda_1 \ge \cdots \ge \lambda_n$ are the n eigenvalues of the symmetric matrix $(2d-1)^{-1/2} A_n$. We do this rescaling of $A_n$ irrespective of whether d is fixed or growing so as to keep all but the first eigenvalue bounded with high probability.

The limiting distribution of linear eigenvalue statistics for various RMT models such as the classical invariant ensembles or the Wigner/Wishart matrices has been (and continues to be) widely studied. For the sake of space, we give here only a brief (and therefore incomplete) list of methods and papers which study the subject. For a more in-depth review, we refer the reader to [AGZ10].

The first, and still one of the most widely used methods of approach is the method of moments, introduced in [Wig55], used in [Jon82] and perfected in [SS98] for Wigner matrices (it also works for Wishart); this method is also used here in conjunction with other tools. Explicit moment calculations alongside Toeplitz determinants have also been used in determining the linear statistics of circular ensembles [Sze52], [DE01], [Joh88].

Other methods include the Stieltjes transform method (also known as the method of resolvents), which was employed with much success in a series of papers of which we mention [BS04] and [LP09]; the (quite analytical) method of potentials, which works on a different class of random matrices including the Gaussian Wigner ones [Joh98]; stochastic calculus [CD01]; and free probability [KMS07]. Finally, a completely different set of techniques was explored in [Cha09].

Recently and notably, for a single permutation matrix, such a study has been approached in [Wie00] and completed in [BAD11]; our results share several features with the latter paper.

A noteworthy aspect in all these is that when the function $f$ is smooth enough (usually analytic), the variance of the random variables $\sum_{i=1}^n f(\lambda_i)$ typically remains bounded. This is attributed to eigenvalue repulsion; see [BAG11, Section 21.2.2] for further discussion. Even more interestingly, there is no process convergence of the cumulative distribution function. This can be guessed from the fact that when the function $f$ is rough (e.g., the characteristic function of an interval), the variance of the linear statistics grows slowly with n (as seen for example in [CL95] and [Sos00]). One major difference our models have with the classical ensembles is that our matrices are sparse; their sparsity affects the behavior of the limit.

In Theorems 35 and 39 we prove limiting distributions of linear eigenvalue statistics. For fixed d, the functions we cover are those that are analytically continuable to a large enough ellipse containing a compact interval of spectral support. When d grows we need functions that are slightly more smooth. Let $(T_k)_{k \in \mathbb{N}}$ be the Chebyshev polynomials of the first kind on a certain compact interval; since they constitute a basis for $L^2$ functions, any such function admits a decomposition in a Fourier-like series expressed in terms of the Chebyshev polynomials. The required smoothness is characterized in terms of how quickly the truncated series converges in the supremum norm to the actual function on the given interval.

In Theorem 35 we consider d to be fixed. The limiting distribution of the linear eigenvalue statistics is a non-Gaussian infinitely divisible distribution. This is consistent with the results in [BAD11]. Theorem 39 proves a Gaussian limit in the case of a slowly growing d after we have appropriately centered the random variables. This transition is expected. In [DP10] the authors consider the uniform model of random regular graphs and show that when d is growing slowly, the spectrum of the adjacency matrix starts resembling that of a real symmetric Wigner matrix. Similar techniques, coupled with estimates proved in this paper, could be used to extend such results to the present model.

The proofs in this section follow easily from the results in parts (i) and (ii) above. As in [DP10], the proofs display interesting combinatorial interpretations of analytic quantities common in RMT.
2. A weak convergence set-up

The following weak convergence set-up will be used to prove the limit theorems in the later text. Let $\omega := (\omega_m)_{m \in \mathbb{N}}$ be a sequence of positive weights that decay to zero at a suitable rate as m tends to infinity. Let $L^2(\omega)$ denote the space of sequences $(x_m)_{m \in \mathbb{N}}$ that are square-integrable with respect to $\omega$, i.e., $\sum_{m=1}^{\infty} x_m^2 \omega_m < \infty$. Our underlying complete separable metric space will be $\mathbf{X} = (L^2(\omega), \|\cdot\|)$, where $\|\cdot\|$ denotes the usual norm.

Remark 1. Although we have chosen to work with $L^2$ for simplicity, any $L^p$ space would have worked as well.

Let us denote the space of probability measures on the Borel $\sigma$-algebra of $\mathbf{X}$ by $\mathbb{P}(\mathbf{X})$. We will skip mentioning the Borel $\sigma$-algebra and refer to a member of $\mathbb{P}(\mathbf{X})$ as a probability measure on $\mathbf{X}$. We equip $\mathbb{P}(\mathbf{X})$ with the Prokhorov metric for weak convergence; for the standard results on weak convergence we use below, please consult Chapter 3 in [EK86]. Let $\rho$ denote the Prokhorov metric on $\mathbb{P}(\mathbf{X}) \times \mathbb{P}(\mathbf{X})$ as given in [EK86, eqn. (1.1) on page 96].

Lemma 2. The metric space $(\mathbb{P}(\mathbf{X}), \rho)$ is a complete separable metric space.

Proof. The claim follows from [EK86, Thm. 1.7, p. 101] since $\mathbf{X}$ is a complete separable metric space. □

To prove tightness of subsets of $\mathbb{P}(\mathbf{X})$ we will use the following class of compact subsets of $L^2(\omega)$.

Lemma 3 (The infinite cube). Let $(a_m)_{m \in \mathbb{N}} \in L^2(\omega)$ be such that $a_m \ge 0$ for every m. Then the set

$$\{(b_m)_{m \in \mathbb{N}} \in L^2(\omega) : 0 \le |b_m| \le a_m \text{ for all } m \in \mathbb{N}\}$$

is compact in $(L^2(\omega), \|\cdot\|)$.

Proof. First observe that the cube is compact in the product topology by Tychonoff's theorem. Norm convergence to the limit points now follows by the Dominated Convergence Theorem. □

We now explore some consequences of relative compactness. 

Lemma 4. Suppose $\{X_n\}$ and $X$ are random sequences taking values in $L^2(\omega)$ such that $X_n$ converges in law to $X$. Then, for any $b \in L^2(\omega)$, the random variables $\langle b, X_n \rangle$ converge in law to $\langle b, X \rangle$.

Proof. This is a corollary of the usual Continuous Mapping Theorem. □ 

Our final lemma shows that finite-dimensional distributions characterize a probability measure on the Borel $\sigma$-algebra on $\mathbf{X}$.

Lemma 5. Let $x$ be a typical element in $\mathbf{X}$. Let P and Q be two probability measures on $\mathbf{X}$. Suppose for any finite collection of indices $(i_1, \ldots, i_k)$, the law of the random vector $(x_{i_1}, \ldots, x_{i_k})$ is the same under both P and Q. Then P = Q on the entire Borel $\sigma$-algebra.

Proof. Our claim will follow once we show that P and Q give identical mass to every basic open neighborhood determined by the norm; however, the norm function $x \mapsto \|x\|$ is measurable with respect to the $\sigma$-algebra generated by coordinate projections. Now, under our assumption, every finite-dimensional distribution is identical under P and Q; hence the probability measures P and Q are identical on the coordinate $\sigma$-algebra. This proves our claim. □

3. Some results on Poisson approximation 

3.1. Cycles in random regular graphs. Let $G_n$ be the 2d-regular graph on n vertices sampled from $\mathcal{G}_{n,2d}$, the permutation model of random regular graphs. The graph $G_n$ is generated from the uniform random permutations $\pi_1, \ldots, \pi_d$ as described in the introduction. Assume that the vertices of $G_n$ are labeled by $\{1, \ldots, n\}$, and let $C_k^{(n)}$ denote the number of (simple) cycles of length k in $G_n$.

We start by giving the limiting distribution of $C_k^{(n)}$ as $n \to \infty$. Suppose that $w = w_1 \cdots w_k$ is a word on the letters $\pi_1, \ldots, \pi_d$ and $\pi_1^{-1}, \ldots, \pi_d^{-1}$. We call $w$ cyclically reduced if $w_1 \ne w_k^{-1}$ and $w_i \ne w_{i+1}^{-1}$ for $1 \le i < k$. Let $a(d,k)$ denote the number of cyclically reduced words of length k on this alphabet.
Proposition 6. As $n \to \infty$ while k and d are kept fixed,

$$C_k^{(n)} \xrightarrow{\ \mathcal{L}\ } \mathrm{Poi}\left(\frac{a(d,k)}{2k}\right).$$

We will actually give a stronger version of this result in Theorem 8, but we include this proposition nevertheless because it has a more elementary proof, and because in proving it we will develop some lemmas that will come in handy later. We also note the following exact expression for $a(d,k)$,

(1) $$a(d, 2k) = (2d-1)^{2k} - 1 + 2d, \quad\text{and}\quad a(d, 2k+1) = (2d-1)^{2k+1} + 1,$$

whose proof we provide in the Appendix (see Lemma HTj) . 
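The closed form (1) can be checked by brute force for small d and k. The sketch below is only an illustration, not the Appendix proof. It encodes the letter $\pi_i$ as the integer i and $\pi_i^{-1}$ as -i (our own convention), enumerates all words, and keeps those that are cyclically reduced.

```python
from itertools import product

def count_cyclically_reduced(d, k):
    """Brute-force a(d, k): words w_1...w_k with w_i != w_{i+1}^{-1} cyclically."""
    letters = [sign * i for i in range(1, d + 1) for sign in (1, -1)]
    return sum(
        1
        for w in product(letters, repeat=k)
        # index (i + 1) % k wraps around, which also enforces w_1 != w_k^{-1}
        if all(w[i] != -w[(i + 1) % k] for i in range(k))
    )

def a_closed_form(d, k):
    """The expression in (1), written for a general length k."""
    if k % 2 == 0:
        return (2 * d - 1) ** k - 1 + 2 * d
    return (2 * d - 1) ** k + 1

checks = [
    (d, k, count_cyclically_reduced(d, k), a_closed_form(d, k))
    for d in (1, 2, 3)
    for k in range(1, 6)
]
```

Each tuple in `checks` pairs the enumerated count with the closed form; the two agree on this range.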

Our argument heavily uses the concepts of [LP10], but we will try to make our proof self-contained. Let $\mathcal{W}$ be the set of cyclically reduced words of length k on letters $\pi_1, \ldots, \pi_d$ and $\pi_1^{-1}, \ldots, \pi_d^{-1}$. For $w \in \mathcal{W}$, we define a closed trail with word $w$ to be an object of the form

$$s_0 \xrightarrow{w_1} s_1 \xrightarrow{w_2} s_2 \xrightarrow{w_3} \cdots \xrightarrow{w_k} s_k = s_0$$

with $s_i \in \{1, \ldots, n\}$. In Section 3.1, we will consider only the case where $s_0, \ldots, s_{k-1}$ are distinct, though we will drop this assumption in Section 3.2. We say that the trail appears in $G_n$ if $w_1(s_0) = s_1$, $w_2(s_1) = s_2$, and so on. In other words, we are considering $G_n$ as a directed graph with edges labeled by the permutations that gave rise to them, and we are asking if it contains the trail as a subgraph. We note that a trail (with distinct vertices) can only appear in $G_n$ if its word is cyclically reduced.

To give an idea of the method we will use, we demonstrate how to calculate $\lim_{n\to\infty} E[C_k^{(n)}]$. Suppose we have a trail with word $w$. Let $e_w^i$ be the number of times $\pi_i$ or $\pi_i^{-1}$ appears in $w$. It is straightforward to see that the trail appears in $G_n$ with probability $\prod_{i=1}^d 1/[n]_{e_w^i}$, where

$$[x]_j = x(x-1)\cdots(x-j+1)$$

is the falling factorial or Pochhammer symbol.

For every word in $\mathcal{W}$, there are $[n]_k$ trails with that word. The total number of trails of length k contained in $G_n$ is 2k times the number of cycles, so

(2) $$2k\, E[C_k^{(n)}] = \sum_{w \in \mathcal{W}} [n]_k \prod_{i=1}^d \frac{1}{[n]_{e_w^i}}.$$

Each summand converges to 1 as $n \to \infty$ (since $\sum_i e_w^i = k$), giving $E[C_k^{(n)}] \to a(d,k)/2k$, consistent with Proposition 6.
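To see the convergence in (2) numerically, one can evaluate the right-hand side exactly for small parameters. The following is a hedged sketch using exact rational arithmetic. The helper names `falling` and `expected_cycles`, and the encoding of $\pi_i^{\pm 1}$ as $\pm i$, are our own assumptions.

```python
from fractions import Fraction
from itertools import product

def falling(n, j):
    """Falling factorial [n]_j = n (n-1) ... (n-j+1)."""
    out = 1
    for i in range(j):
        out *= n - i
    return out

def expected_cycles(n, d, k):
    """Right-hand side of (2) divided by 2k, as an exact rational number."""
    letters = [sign * i for i in range(1, d + 1) for sign in (1, -1)]
    total = Fraction(0)
    for w in product(letters, repeat=k):
        if any(w[i] == -w[(i + 1) % k] for i in range(k)):
            continue  # skip words that are not cyclically reduced
        denom = 1
        for i in range(1, d + 1):
            e_i = sum(1 for x in w if abs(x) == i)  # uses of pi_i or its inverse
            denom *= falling(n, e_i)
        total += Fraction(falling(n, k), denom)
    return total / (2 * k)

d, k = 2, 3
limit = Fraction(28, 6)  # a(2, 3) / (2 * 3), with a(2, 3) = 3**3 + 1 from (1)
values = [expected_cycles(n, d, k) for n in (10, 100, 10000)]
```

For these parameters the values increase towards the limit as n grows, in line with each summand of (2) tending to 1.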
To prove Proposition 6, we will need to count more complicated objects than in the above example, and we will need some machinery from [LP10]. Suppose we have the following list of r trails with words $w^1, \ldots, w^r \in \mathcal{W}$:

(3) $$s_0^j \xrightarrow{w_1^j} s_1^j \xrightarrow{w_2^j} \cdots \xrightarrow{w_k^j} s_k^j = s_0^j, \qquad 1 \le j \le r,$$

with $s_i^j \in \{1, \ldots, n\}$. Though we take the vertices $s_0^j, \ldots, s_{k-1}^j$ of each trail to be distinct, vertices from different trails may coincide (see Figure 1 for an example).

Suppose we have another list of r trails, $(u_i^j,\ 0 \le i \le k,\ 1 \le j \le r)$, with the same words $w^1, \ldots, w^r$. We say that these two lists are of the same category if $s_i^j = s_{i'}^{j'} \iff u_i^j = u_{i'}^{j'}$. Roughly speaking, this means that the trails in the two lists overlap each other in the same way. The probability that some list of trails appears in $G_n$ depends only on its category.

We can represent each category as a directed, edge-labeled graph depicting the overlap of the trails. This is more complicated to explain than to do, and we encourage the reader to simply look at the example in Figure 1 or at Figure 7 in [LP10]. Given the list of trails $(s_i^j)$, we define this graph as
Figure 1. A list of two trails, and the graph associated with its category. Vertices of the two trails that coincide (those equal to 3, and those equal to 1) are blocked together in the graph.

Figure 2. The graph $\Gamma$ associated with a single trail.

follows. First, reconsider the variables $s_i^j$ simply as abstract labels rather than elements of $\{1, \ldots, n\}$, and partition these labels by placing any two of them in a block together if (considered as integers again) they are equal. The graph has these blocks as its vertices. It includes an edge labeled $\pi_i$ from one block to another if the trails include a step labeled $\pi_i$ or $\pi_i^{-1}$ from any vertex in the first block to any vertex in the second; this edge should be directed according to whether the step was labeled $\pi_i$ or $\pi_i^{-1}$.


Suppose that $\Gamma$ is the graph of a category of a list of trails, and define $X_\Gamma^{(n)}$ to be the number of tuples of trails of category $\Gamma$ found in $G_n$. If $\Gamma$ is the graph of a category of a list of a single trail with word $w \in \mathcal{W}$, we write $X_w^{(n)}$ for $X_\Gamma^{(n)}$. Note that such graphs have a simple form, demonstrated in Figure 2.



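Lemma 7.

$$\lim_{n\to\infty} E\big[X_\Gamma^{(n)}\big] = \begin{cases} 1 & \text{if } \Gamma \text{ has the same number of vertices as edges,} \\ 0 & \text{otherwise.} \end{cases}$$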


To demonstrate the connection to the calculation we performed in (2), observe that

$$2k\, C_k^{(n)} = \sum_{w \in \mathcal{W}} X_w^{(n)},$$

and by our lemma the expectation of this converges to $a(d,k)$ as $n \to \infty$.
Figure 3. A graph formed from three trails of length 6, all identified with each other. There are $a(d,6)$ choices for the edge-labels $w$. There are six choices for which element of the second trail $s^2$ will be identified with $s_0^1$, and two choices for how to orient the trail $s^2$ when identifying it with $s^1$. There are also six choices for which element of the third trail $s^3$ will be identified with $s_0^1$, along with another two choices for its orientation. All together, there are $a(d,6)(2 \cdot 6)^2$ elements of $\mathcal{G}'$ corresponding to the partition of three elements into one part.

Proof of Lemma 7. This is essentially the same calculation as in (2). Let e and v be the number of edges and vertices, respectively, of the graph $\Gamma$. Let $e_i$ be the number of edges in $\Gamma$ labeled by $\pi_i$.

There are $[n]_v$ different trails of category $\Gamma$, corresponding to the number of ways to assign vertices $\{1, \ldots, n\}$ to the vertices of $\Gamma$. Since each of these trails appears in $G_n$ with probability $\prod_{i=1}^d 1/[n]_{e_i}$,

(4) $$E\big[X_\Gamma^{(n)}\big] = [n]_v \prod_{i=1}^d \frac{1}{[n]_{e_i}}.$$

As $n \to \infty$, this converges to 0 if $e > v$ and to 1 if $e = v$. If $\Gamma$ is the graph of a category of a list of trails, then every vertex has degree at least 2, so it never happens that $e < v$, which completes the lemma. We note for later use that this remains true even when we drop the requirement that all vertices of a trail be distinct, so long as the word of each trail is cyclically reduced. □

Proof of Proposition 6. We will use the moment method. Fix a positive integer r. The main idea of the proof is to interpret $(C_k^{(n)})^r$ as the number of r-tuples of cycles of length k in $G_n$. As there are 2k closed trails for every cycle of length k, we can also think of it as $(2k)^{-r}$ times the number of r-tuples of closed trails of length k in $G_n$.

Let $\mathcal{G}$ be the set of graphs of categories of lists of r trails of length k. The above interpretation implies that

(5) $$E\big[(C_k^{(n)})^r\big] = (2k)^{-r} \sum_{\Gamma \in \mathcal{G}} E\big[X_\Gamma^{(n)}\big].$$

By Lemma 7, we can compute $\lim_{n\to\infty} E[(C_k^{(n)})^r]$ by counting the number of graphs in $\mathcal{G}$ with the same number of edges as vertices. Let $\mathcal{G}' \subset \mathcal{G}$ be the set of such graphs.

Let $\Gamma \in \mathcal{G}'$, and consider some list of r trails of category $\Gamma$. Since $\Gamma$ has as many edges as vertices, it consists of disjoint cycles. This implies that for any two trails in the list, either the trails are wholly identified in $\Gamma$, or they are disjoint. These identifications of the r different trails give a partition of r objects.

Given some partition of the r objects into m parts, we will count the graphs in $\mathcal{G}'$ whose trails are identified according to the partition (see Figure 3 for an example). Consider some part consisting of p trails. The trails form a cycle in $\Gamma$; we need to count the number of different ways to label the edges and vertices. There are $a(d,k)$ different ways to label the edges. Each trail in the part can have its vertices identified with those of the first trail in 2k different ways, for a total of $(2k)^{p-1}$ choices. Thus the number of choices for this part is $a(d,k)(2k)^{p-1}$. Doing this for every part in the given partition, we have a total of $a(d,k)^m (2k)^{r-m}$. Recalling that the number of partitions of r objects into m parts is given by the Stirling number of the second kind $S(r,m)$,

$$|\mathcal{G}'| = \sum_{m=1}^{r} S(r,m)\, a(d,k)^m (2k)^{r-m}.$$

By (5) and Lemma 7,

$$\lim_{n\to\infty} E\big[(C_k^{(n)})^r\big] = \sum_{m=1}^{r} S(r,m) \left(\frac{a(d,k)}{2k}\right)^m.$$

It is well known that this is the rth moment of the $\mathrm{Poi}(a(d,k)/2k)$ distribution (see for example [Pit97]), and that this distribution is determined by its moments, thus proving the proposition. □
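The moment identity used in the last step can be checked numerically. The sketch below is an illustration only. It computes Stirling numbers of the second kind by the standard recurrence and compares $\sum_{m=1}^r S(r,m)\lambda^m$ with a truncated series for the rth moment of $\mathrm{Poi}(\lambda)$. The function names and the truncation at 200 terms are our own choices.

```python
import math

def stirling2(r, m):
    """Stirling number of the second kind via S(r,m) = m S(r-1,m) + S(r-1,m-1)."""
    table = [[0] * (m + 1) for _ in range(r + 1)]
    table[0][0] = 1
    for i in range(1, r + 1):
        for j in range(1, min(i, m) + 1):
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[r][m]

def poisson_moment(lam, r, terms=200):
    """r-th moment of Poi(lam), summing the series up to `terms` terms."""
    pmf = math.exp(-lam)          # P[X = 0]
    total = 0.0
    for j in range(terms):
        total += (j ** r) * pmf
        pmf *= lam / (j + 1)      # P[X = j + 1] from P[X = j]
    return total

def moment_via_stirling(lam, r):
    return sum(stirling2(r, m) * lam ** m for m in range(1, r + 1))

lam = 28 / 6  # a(2, 3) / (2 * 3), as an example value
pairs = [(poisson_moment(lam, r), moment_via_stirling(lam, r)) for r in range(1, 6)]
```

For r = 2 the Stirling sum is $\lambda + \lambda^2$, the familiar second moment of a Poisson variable.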

This proposition tells us the limiting distribution of $C_k^{(n)}$ as $n \to \infty$, with d and k fixed, but tells us nothing if d and k grow with n. The following theorem addresses this, and gives us a quantitative bound on the rate of convergence. We will assume throughout that $d \ge 2$; we use this assumption only to simplify some of our asymptotic quantities, but as far better results for the d = 1 case are already known (see [AT92]), we see no reason to complicate things. For clarity, we state this and future results with an explicit constant rather than big-O notation, but it is the order, not the constant, that interests us. Recall that the total variation distance between two probability measures is the largest possible difference between the probabilities that they assign to the same event.



Theorem 8. There is a constant $C_0$ such that for any n, k, and $d \ge 2$, the total variation distance between the law of $C_k^{(n)}$ and $\mathrm{Poi}(a(d,k)/2k)$ is bounded by $C_0 k(2d-1)^k/n$.

Proof. We will prove this using Stein's method; good introductions to Stein's method for the Poisson distribution can be found in [CDM05], [BC05], and especially [BHJ92], which focuses on the technique of size-biased coupling that we will employ. We give here the basic set-up. Let $\mathbb{Z}_+$ denote the nonnegative integers. For any $A \subset \mathbb{Z}_+$, let $g = g_{\lambda,A}$ be the function on $\mathbb{Z}_+$ satisfying

$$\lambda g(j+1) - j g(j) = 1_{j \in A} - \mathrm{Poi}(\lambda)\{A\}$$

with $g(0) = 0$, where $\mathrm{Poi}(\lambda)\{A\}$ denotes the measure of A under the $\mathrm{Poi}(\lambda)$ distribution. This function g is called the solution to the Stein equation. For any nonnegative integer-valued random variable X,

(6) $$P[X \in A] - \mathrm{Poi}(\lambda)\{A\} = E[\lambda g(X+1) - X g(X)].$$

Bounding the right hand side of this equation over all choices of g thus bounds the total variation distance between the law of X and the $\mathrm{Poi}(\lambda)$ distribution. The following estimates on g are standard (see [BHJ92, Lemma 1.1.1], for example):

(7) $$\|g\|_\infty \le \min(1, \lambda^{-1/2}), \qquad \Delta g \le \min(1, \lambda^{-1}),$$

where $\Delta g = \sup_j |g(j+1) - g(j)|$.
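The Stein equation determines g by a forward recursion from $g(0) = 0$, which makes the estimates (7) easy to inspect numerically for a particular λ and A. The sketch below is illustrative only: the choices λ = 2 and A = {0, 3} are arbitrary, and the recursion is carried just a few steps because rounding errors grow with j.

```python
import math

def poisson_pmf(lam, j):
    return math.exp(-lam) * lam ** j / math.factorial(j)

def stein_solution(lam, A, jmax):
    """Values g(0), ..., g(jmax) solving lam*g(j+1) - j*g(j) = 1[j in A] - Poi(lam){A}."""
    prob_A = sum(poisson_pmf(lam, j) for j in A)   # A is assumed finite here
    g = [0.0]                                      # convention g(0) = 0
    for j in range(jmax):
        indicator = 1.0 if j in A else 0.0
        g.append((j * g[j] + indicator - prob_A) / lam)
    return g

lam, A = 2.0, {0, 3}
g = stein_solution(lam, A, 12)
sup_g = max(abs(x) for x in g)
delta_g = max(abs(g[j + 1] - g[j]) for j in range(len(g) - 1))
```

On this range, `sup_g` and `delta_g` stay below $\min(1, \lambda^{-1/2})$ and $\min(1, \lambda^{-1})$ respectively, as (7) predicts.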

Let $\mathcal{C}$ be the set of closed trails of length k on n vertices, with two trails identified if one is a cyclic or inverted cyclic shift of the other. Elements of $\mathcal{C}$ are essentially cycles in the complete graph on n vertices, with edges labeled by $\pi_1, \ldots, \pi_d$ and $\pi_1^{-1}, \ldots, \pi_d^{-1}$. We note that $|\mathcal{C}| = [n]_k\, a(d,k)/2k$.

For $t \in \mathcal{C}$, let $F_t = 1_{\{t \text{ occurs in } G_n\}}$. Let $\lambda = a(d,k)/2k$. We abbreviate $C_k^{(n)}$ to C, and we note that $C = \sum_{t \in \mathcal{C}} F_t$. We can evaluate the right hand side of (6) as

$$E[\lambda g(C+1) - C g(C)] = \sum_{s \in \mathcal{C}} \left( \frac{1}{[n]_k} E[g(C+1)] - E[F_s g(C)] \right).$$

Let $p_t = E[F_t]$. We note that $F_s g(C) = F_s\, g\big(\sum_{t \ne s} F_t + 1\big)$ and that

$$E\Big[F_s\, g\Big(\sum_{t \ne s} F_t + 1\Big)\Big] = p_s\, E\Big[g\Big(\sum_{t \ne s} F_t + 1\Big) \,\Big|\, F_s = 1\Big].$$
In Lemma 10 we will construct for each $s \in \mathcal{C}$ a random variable $Y_s$ on the same probability space as C that has the distribution of $\sum_{t \ne s} F_t$ conditioned on $F_s = 1$. Then we evaluate

$$\big|E[\lambda g(C+1) - C g(C)]\big| = \left| \sum_{s \in \mathcal{C}} \left( \frac{1}{[n]_k} E[g(C+1)] - p_s E[g(Y_s+1)] \right) \right| \le \sum_{s \in \mathcal{C}} \frac{1}{[n]_k} E\big|g(C+1) - g(Y_s+1)\big| + \sum_{s \in \mathcal{C}} \left| \frac{1}{[n]_k} - p_s \right| E\big|g(Y_s+1)\big|.$$

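We bound these terms as follows:

$$|g(C+1) - g(Y_s+1)| \le \Delta g\, |C - Y_s|$$

and, since $1/n^k \le p_s \le 1/[n]_k$,

$$\left| \frac{1}{[n]_k} - p_s \right| \le \frac{1}{[n]_k} - \frac{1}{n^k} \le \frac{k^2}{2n[n]_k}.$$

This last bound makes use of the inequality $[n]_k \ge n^k(1 - k^2/2n)$. Applying these bounds gives

(8) $$\big|E[\lambda g(C+1) - C g(C)]\big| \le \frac{\Delta g}{[n]_k} \sum_{s \in \mathcal{C}} E|C - Y_s| + \frac{k^2\, |\mathcal{C}|\, \|g\|_\infty}{2n[n]_k}.$$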



To get a good bound on this, we just need to demonstrate how to construct $Y_s$ so that $E|C - Y_s|$ is small. We sketch our method as follows: Fix $s \in \mathcal{C}$, and let $G'_n$ be a random graph on n vertices distributed as $G_n$ conditioned to contain the cycle s. We will couple $G'_n$ with $G_n$ in a natural way, and then prove in Lemma 9 that $G_n$ and $G'_n$ differ only slightly. We then define $Y_s$ in terms of $G'_n$, and we establish in Lemma 10 that $E|C - Y_s|$ is small. Finally, we finish the proof of Theorem 8 by using these results to bound the right side of (8).

We start by constructing $G'_n$. Fix some $s \in \mathcal{C}$. The basic idea is to modify the permutations $\pi_1, \ldots, \pi_d$ to get random permutations $\pi'_1, \ldots, \pi'_d$, which we will then use to create a 2d-regular graph $G'_n$ in the usual way. Before we give our construction of $\pi'_1, \ldots, \pi'_d$, we consider what distributions they should have. Suppose for example that d = 3 and s is

$$1 \xrightarrow{\pi_3} 2 \xrightarrow{\pi_1^{-1}} 3 \xrightarrow{\pi_3} 4 \xrightarrow{\pi_1} 1.$$

To force $G'_n$ to contain s, $\pi'_1$ should be a uniform random permutation conditioned to make $\pi'_1(4) = 1$ and $\pi'_1(3) = 2$, $\pi'_2$ a uniform random permutation with no conditioning, and $\pi'_3$ a uniform random permutation conditioned to make $\pi'_3(1) = 2$ and $\pi'_3(3) = 4$.

We now describe the construction of $\pi'_1, \ldots, \pi'_d$. Suppose s has the form

$$s_0 \xrightarrow{w_1} s_1 \xrightarrow{w_2} s_2 \xrightarrow{w_3} \cdots \xrightarrow{w_k} s_k = s_0.$$

(The element s is actually an equivalence class of the 2k different cyclic and inverted cyclic shifts of the above trail, but we will continue to represent it as above.) Let $1 \le l \le d$, and suppose that the edge-labels $\pi_l$ and $\pi_l^{-1}$ appear M times in the cycle s, and let $(a_m, b_m)$ for $1 \le m \le M$ be these edges. If $(a_m, b_m)$ is labeled $\pi_l$, then $a_m$ is the tail and $b_m$ the head of the edge; if it is labeled $\pi_l^{-1}$, then $a_m$ is the head and $b_m$ the tail. We must construct $\pi'_l$ to have the uniform distribution conditioned on

(9) $$\pi'_l(a_m) = b_m \quad \text{for } (a_m, b_m),\ 1 \le m \le M.$$

We define a sequence of random transpositions by the following algorithm: Let $\tau_1$ swap $\pi_l(a_1)$ and $b_1$. Let $\tau_2$ swap $\tau_1 \pi_l(a_2)$ and $b_2$, and so on. We then define $\pi'_l = \tau_M \cdots \tau_1 \pi_l$. This permutation satisfies $\pi'_l(a_m) = b_m$ for $1 \le m \le M$, and it is distributed uniformly, subject to the given constraints, which is easily proven by induction on each swap. This completes our construction of $\pi'_1, \ldots, \pi'_d$.
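The transposition algorithm can be sketched in a few lines. The code below is our own illustration, with permutations stored as 0-based lists and a hypothetical function name, not the paper's implementation. It also checks the uniformity claim exhaustively for n = 5 with two constraints: every permutation satisfying the constraints arises from the same number of unconstrained permutations.

```python
from collections import Counter
from itertools import permutations

def condition_by_transpositions(perm, constraints):
    """Return tau_M ... tau_1 perm, where tau_m swaps the current image of a_m with b_m.

    perm[i] is the image of i; constraints is a list of pairs (a_m, b_m)
    with distinct a_m and distinct b_m.
    """
    new = list(perm)
    for a, b in constraints:
        x = new[a]                     # current image of a_m
        if x != b:
            pos_b = new.index(b)       # the point currently mapped to b_m
            new[a], new[pos_b] = b, x  # composing with the transposition (x b)
    return new

constraints = [(0, 1), (2, 3)]
images = Counter(
    tuple(condition_by_transpositions(list(p), constraints))
    for p in permutations(range(5))
)
```

Because a later swap only exchanges two values different from the earlier $b_m$, the constraints already enforced are never disturbed.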

We now define $G'_n$ to be the random graph on n vertices with edges $(i, \pi'_j(i))$ for every $1 \le i \le n$ and $1 \le j \le d$. It is evident that $G'_n$ is defined on the same probability space as $G_n$ and is distributed as $G_n$ conditioned on containing s. The key fact is that $G'_n$ is nearly identical to $G_n$:
Lemma 9. Suppose there is an edge $i \xrightarrow{\pi_l} j$ contained in $G_n$ but not in $G'_n$. Then the trail s contains either an edge of the form $i \xrightarrow{\pi_l} v$ with $v \ne j$ or of the form $v \xrightarrow{\pi_l} j$ with $v \ne i$.

Proof. Suppose $\pi_l(i) = j$, but $\pi'_l(i) \ne j$. Then j must have been swapped when making $\pi'_l$, which can happen only if $\pi_l(a_m) = j$ or $b_m = j$ for some m. In the first case, $a_m = i$ and s contains the edge $i \xrightarrow{\pi_l} b_m$ with $b_m \ne j$, and in the second s contains the edge $a_m \xrightarrow{\pi_l} j$ with $a_m \ne i$. □



If s contains an edge of the form $i \xrightarrow{\pi_l} v$ with $v \ne j$ or of the form $v \xrightarrow{\pi_l} j$ with $v \ne i$, then $G'_n$ cannot possibly contain $i \xrightarrow{\pi_l} j$ while still containing s. The preceding lemma then says that we have coupled $G_n$ and $G'_n$ as best we can, in the following sense: $G'_n$ keeps as many edges of $G_n$ as it can, given that it contains s.

For $t \in \mathcal{C}$, let $F_t' = 1_{\{G_n' \text{ contains } t\}}$. Define $Y_s$ by $Y_s = \sum_{t \ne s} F_t'$. Since $G_n'$ is distributed as $G_n$
conditioned to contain $s$, the random variable $Y_s$ is distributed as $\sum_{t \ne s} F_t$ conditioned on $F_s = 1$.
We now proceed to bound $\mathbf{E}|C - Y_s|$, adding in the minor technical condition that $k < n^{1/6}$.



Lemma 10. There exists an absolute constant $C_1$ with the following property. For any $s \in \mathcal{C}$ and $Y_s$
defined above, and for all $n$, $k$, and $d \ge 2$ satisfying $k < n^{1/6}$,

(10) $\mathbf{E}|C - Y_s| \le \dfrac{C_1 k (2d-1)^{k}}{n}$.


Proof. We start by partitioning the cycles of $\mathcal{C}$ according to how many edges they share with $s$. Define
$\mathcal{C}_{-1}$ as all elements in $\mathcal{C}$ that contain an edge $s_i \xrightarrow{w_{i+1}} v$ with $v \ne s_{i+1}$ or an edge $v \xrightarrow{w_{i+1}} s_{i+1}$ with
$v \ne s_i$. For $0 \le j < k$, define $\mathcal{C}_j$ as all elements of $\mathcal{C} \setminus \mathcal{C}_{-1}$ that share exactly $j$ edges with $s$.

The sets $\mathcal{C}_{-1}, \ldots, \mathcal{C}_{k-1}$ include every element of $\mathcal{C}$ except for $s$. Loosely, this classifies elements of
$\mathcal{C}$ according to their likelihood of appearing in $G_n'$ compared to in $G_n$: trails in $\mathcal{C}_{-1}$ never appear in $G_n'$;
trails in $\mathcal{C}_0$ appear in $G_n'$ with nearly the same probability as in $G_n$; and the trails in $\mathcal{C}_i$ for $i \ge 1$ appear in $G_n'$
considerably more often than in $G_n$.

This classification of elements of $\mathcal{C}$ works nicely with our coupling. Suppose $t \in \mathcal{C}_i$ for $i \ge 0$.
Lemma 9 shows that if $t$ appears in $G_n$, it must also appear in $G_n'$. That is, $F_t' \ge F_t$ for all $t \in \mathcal{C}_i$ for
$i \ge 0$. On the other hand, $F_t' = 0$ for all $t \in \mathcal{C}_{-1}$. Using this,



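$$\mathbf{E}|C - Y_s| \le p_s + \mathbf{E}\Big|\sum_{t\in\mathcal{C}_{-1}}(F_t - F_t') + \sum_{t\in\mathcal{C}_0}(F_t - F_t') + \sum_{i=1}^{k-1}\sum_{t\in\mathcal{C}_i}(F_t - F_t')\Big|$$

(11) $$\le p_s + \sum_{t\in\mathcal{C}_{-1}}\mathbf{E}[F_t] + \sum_{t\in\mathcal{C}_0}\mathbf{E}[F_t' - F_t] + \sum_{i=1}^{k-1}\sum_{t\in\mathcal{C}_i}\mathbf{E}[F_t' - F_t]$$

$$\le p_s + \sum_{t\in\mathcal{C}_{-1}} p_t + \sum_{t\in\mathcal{C}_0}(p_t' - p_t) + \sum_{i=1}^{k-1}\sum_{t\in\mathcal{C}_i} p_t'$$

with $p_t' = \mathbf{E}[F_t']$.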

The rest of the proof is an analysis of $|\mathcal{C}_i|$ and of $p_t'$. We start by considering the first sum.

For any edge $s_i \xrightarrow{w_{i+1}} v$ with $v \ne s_{i+1}$ or $v \xrightarrow{w_{i+1}} s_{i+1}$ with $v \ne s_i$, there are no more than
$[n-2]_{k-2}(2d-1)^{k-1}$ trails containing that edge (identifying cyclic and inverted cyclic shifts). This
gives the bound

$$|\mathcal{C}_{-1}| \le 2k(n-2)[n-2]_{k-2}(2d-1)^{k-1}.$$





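Applying $p_t \le 1/[n]_k$,

$$\sum_{t\in\mathcal{C}_{-1}} p_t \le \frac{2k(n-2)[n-2]_{k-2}(2d-1)^{k-1}}{[n]_k} \le \frac{2k(2d-1)^{k-1}}{n}.$$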

For the next sum, we note that with $e_l$ denoting the number of times $\pi_l$ and $\pi_l^{-1}$ appear in the word
of $t$, for any $t \in \mathcal{C}_0$,

d ^ d ^ 



Thus we have p£ < l/[n — fc]fc and pt > Using the bound \Co\ < |C| = a(d, k)[n]k/2k, we have 

E04-ft)^ a(d,fc)[n]fc/r 1 1 

tec 




The last and most involved calculation is to bound $|\mathcal{C}_i|$. Fix some choice of $i$ edges of $s$. We start by
counting the number of cycles in $\mathcal{C}_i$ that share exactly these edges with $s$. We illustrate this process
in Figure 4. Call the graph consisting of these edges $H$, and suppose that $H$ has $p$ components. Since
it is a forest, $H$ has $i + p$ vertices.

Let $A_1, \ldots, A_p$ be the components of $H$. We can assemble any $t \in \mathcal{C}_i$ that overlaps with $s$ in $H$
by stringing together these components in some order, with other edges in between. Each component
can appear in $t$ in one of two orientations. Since we consider $t$ only up to cyclic shift and inverted
cyclic shift, we can assume without loss of generality that $t$ begins with component $A_1$ with a fixed
orientation. This leaves $(p-1)!\,2^{p-1}$ choices for the order and orientation of $A_2, \ldots, A_p$ in $t$.

Imagine now the components laid out in a line, with gaps between them, and count the number of
ways to fill the gaps. Each of the $p$ gaps must contain at least one edge, and the total number of edges
in all the gaps is $k - i$. Thus the total number of possible gap sizes is the number of compositions of
$k - i$ into $p$ parts, or $\binom{k-i-1}{p-1}$.

Now that we have chosen the number of edges to appear in each gap, we choose the edges themselves.
We can do this by giving an ordered list of $k - p - i$ vertices to go in the gaps, along with a label and
an orientation for each of the $k - i$ edges this gives. There are $[n-p-i]_{k-p-i}$ ways to choose the
vertices. We can give each new edge any orientation and label subject to the constraint that the word
of $t$ must be reduced. This means we have at most $2d - 1$ choices for the orientation and label of each
new edge, for a total of at most $(2d-1)^{k-i}$.

All together, there are at most $(p-1)!\,2^{p-1}\binom{k-i-1}{p-1}[n-p-i]_{k-p-i}(2d-1)^{k-i}$ elements of $\mathcal{C}_i$ that
overlap with the cycle $s$ at the subgraph $H$. We now calculate the number of different ways to choose
a subgraph $H$ of $s$ with $i$ edges and $p$ components. Suppose $s$ is given as in (9). We first choose a
vertex $s_j$. Then, we can specify which edges to include in $H$ by giving a sequence $a_1, b_1, \ldots, a_p, b_p$
instructing us to include the first $a_1$ edges after $s_j$ in $H$, then to exclude the next $b_1$, then to include
the next $a_2$, and so on. Any sequence for which $a_l$ and $b_l$ are positive integers, $a_1 + \cdots + a_p = i$, and
$b_1 + \cdots + b_p = k - i$ gives us a valid choice of $i$ edges of $s$ making up $p$ components. This counts each
subgraph $H$ a total of $p$ times, since we could begin with any component of $H$. Hence the number of
subgraphs $H$ with $i$ edges and $p$ components is $\frac{k}{p}\binom{i-1}{p-1}\binom{k-i-1}{p-1}$. This gives us the bound

$$|\mathcal{C}_i| \le \sum_{p=1}^{i\wedge(k-i)} \frac{k}{p}\binom{i-1}{p-1}\binom{k-i-1}{p-1}^2 (p-1)!\,2^{p-1}\,[n-p-i]_{k-p-i}(2d-1)^{k-i}.$$

[Figure 4, panel captions.]

The cycle $s$, with $H$ dashed. The subgraph $H$ has components $A_1, \ldots, A_p$. In this example, $p = 3$, $k = 11$, and $i = 4$.

Step 1. We lay out the components $A_1, \ldots, A_p$. We can order and orient $A_2, \ldots, A_p$ however we would like, for a total of $(p-1)!\,2^{p-1}$ choices. Here, we have ordered the components $A_1, A_3, A_2$, and we have reversed the orientation of $A_3$.

Step 2. Next, we choose how many edges will go in each gap between components. Each gap must contain at least one edge, and we must add a total of $k - i$ edges, giving us $\binom{k-i-1}{p-1}$ choices. In this example, we have added one edge after $A_1$, three after $A_3$, and two after $A_2$.

Step 3. We can choose the new vertices in $[n-p-i]_{k-p-i}$ ways, and we can direct and give labels to the new edges in at most $(2d-1)^{k-i}$ ways.

Figure 4. Assembling an element $t \in \mathcal{C}_i$ that overlaps with $s$ at a given subgraph $H$.
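The count of subgraphs $H$ of the cycle $s$ with $i$ edges and $p$ components, namely $\frac{k}{p}\binom{i-1}{p-1}\binom{k-i-1}{p-1}$, can be checked by brute force on small cycles. The sketch below is an illustration only: it assumes the edges of $s$ are indexed 0, ..., k-1 in cyclic order and that $1 \le i \le k-1$; the function names are hypothetical.

```python
from itertools import combinations
from math import comb

def count_subgraphs_brute(k, i, p):
    """Count i-edge subsets of a k-cycle that form exactly p paths."""
    count = 0
    for chosen in combinations(range(k), i):
        chosen_set = set(chosen)
        # A component starts at each chosen edge whose cyclic predecessor is not chosen.
        runs = sum(1 for e in chosen if (e - 1) % k not in chosen_set)
        if runs == p:
            count += 1
    return count

def count_subgraphs_formula(k, i, p):
    """Closed form (k/p) * C(i-1, p-1) * C(k-i-1, p-1)."""
    return k * comb(i - 1, p - 1) * comb(k - i - 1, p - 1) // p
```

For instance, with k = 5 and i = 2 both functions give 5 for p = 1 (adjacent pairs of edges) and 5 for p = 2 (non-adjacent pairs).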
We apply the bounds < ^/{p - 1)! and (^V) ^ ( e ( fc - » - l)/(p - l)) p_1 to get 

1+ ^ p(( 



iA(k-i) 1 / ^2^3 N p-1 



(p-l)V [n-l-i]„- 

Since $k < n^{1/6}$, the sum in the above equation is bounded by an absolute constant. Using the bound
$p_t' \le 1/[n-k]_{k-i}$ for $t \in \mathcal{C}_i$, we have
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and 

i=i tec s v 

These estimates, along with $p_s \le 1/[n]_k$, complete the proof. □

All that remains now is to apply this lemma to finish the proof of Theorem 8. First, consider the
case where $k \ge n^{1/6}$. Then $k(2d-1)^k/n \ge 1$ for sufficiently large values of $n$ (regardless of $d$), in
which case the theorem is trivially satisfied. By choosing $C_0$ large enough, it holds for all $n$ with
$k \ge n^{1/6}$.

When k < n 1 / 6 , we apply Lemma [TO] and © to © to get 

|»PWC + 1) - C 9 (C)„ . *>W> (^l) + W" 1 ^ 



n]k \ n J \ n 
- o / fc(2rf— l) fc \ +0 (k?' 2 {2d- l) fe / 2 



The first term is larger than the second for all but finitely many pairs $(k, d)$ with $d \ge 2$. Hence there
exists $C_0$ large enough that for all $n$, $k$, and $d \ge 2$,

$$|\mathbf{E}[\lambda g(C+1) - C g(C)]| \le \frac{C_0 k(2d-1)^k}{n}.$$ □

We will need a multivariate version of this theorem as well. Define $(C_k^{(\infty)}; k \ge 1)$ to be independent
Poisson random variables, with $C_k^{(\infty)}$ having mean $a(d,k)/2k$. Let $d_{TV}(X, Y)$ denote the total
variation distance between the laws of $X$ and $Y$.

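Theorem 11. There is a constant $C_2$ such that for all $n$, $r$, and $d \ge 2$,

$$d_{TV}\Big(\big(C_1^{(n)}, \ldots, C_r^{(n)}\big), \big(C_1^{(\infty)}, \ldots, C_r^{(\infty)}\big)\Big) \le \frac{C_2(2d-1)^{2r}}{n}.$$
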

Our proof will be very similar to the single variable case above, except that we use Stein's method
for Poisson process approximation (see [BHJ92, Section 10.3]). Let $\lambda_k = a(d,k)/2k$, and let $e_i \in \mathbb{Z}_+^r$
be the vector with $i$th entry one and all other entries zero. Define the operator $\mathcal{A}$ by

$$\mathcal{A}h(x) = \sum_{k=1}^r \lambda_k\big(h(x + e_k) - h(x)\big) + \sum_{k=1}^r x_k\big(h(x - e_k) - h(x)\big)$$

for any $h : \mathbb{Z}_+^r \to \mathbb{R}$ and $x \in \mathbb{Z}_+^r$. We now describe the function that plays a role analogous to $g$ in the
single variable case.
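The role of this operator is easiest to see numerically: if the coordinates of $X$ are independent Poisson variables with means $\lambda_1, \ldots, \lambda_r$, then the expectation of $\mathcal{A}h(X)$ vanishes for bounded $h$. The Python sketch below checks this by truncated summation. The rates, the test function `h`, and the truncation level are arbitrary values chosen for illustration and do not come from the text.

```python
import math
from itertools import product

def poisson_pmf(lam, x):
    return math.exp(-lam) * lam ** x / math.factorial(x)

def stein_operator(h, x, lam):
    """Apply the operator to h at the point x (a tuple of nonnegative integers)."""
    total = 0.0
    for k in range(len(x)):
        up = x[:k] + (x[k] + 1,) + x[k + 1:]
        total += lam[k] * (h(up) - h(x))
        if x[k] > 0:  # the coefficient x_k vanishes at 0, so h(x - e_k) is never needed there
            down = x[:k] + (x[k] - 1,) + x[k + 1:]
            total += x[k] * (h(down) - h(x))
    return total

def expected_stein(h, lam, cutoff=40):
    """Approximate E[A h(X)] for independent Poisson coordinates, truncating each at cutoff."""
    total = 0.0
    for x in product(range(cutoff), repeat=len(lam)):
        weight = 1.0
        for k, xk in enumerate(x):
            weight *= poisson_pmf(lam[k], xk)
        total += weight * stein_operator(h, x, lam)
    return total

rates = (0.7, 1.3)                          # hypothetical values of lambda_1, lambda_2
h = lambda x: math.sin(x[0] + 2 * x[1])     # arbitrary bounded test function
value = expected_stein(h, rates)
```

The computed `value` is zero up to rounding and truncation error, which is why bounding the expectation of $\mathcal{A}h$ under another law measures its distance from the Poisson product law.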

Lemma 12. For any set $A \subseteq \mathbb{Z}_+^r$, there is a function $h : \mathbb{Z}_+^r \to \mathbb{R}$ such that

$$\mathcal{A}h(x) = 1_{x \in A} - \mathbf{P}\big[(C_1^{(\infty)}, \ldots, C_r^{(\infty)}) \in A\big].$$

This function $h$ has the following properties:

(12) $\sup_{1 \le k \le r} |h(x + e_k) - h(x)| \le 1$,

(13) $\sup_{1 \le j,k \le r} |h(x + e_j + e_k) - h(x + e_j) + h(x) - h(x + e_k)| \le 1$.

Proof. This follows from Proposition 10.1.2 and Lemma 10.1.3 in [BHJ92] as applied to a point process
on a space with $r$ elements. □

Our goal is thus to bound $\mathbf{E}[\mathcal{A}h(C_1^{(n)}, \ldots, C_r^{(n)})]$ for any function $h$ as in Lemma 12. We will
abbreviate this vector to $\mathbf{C} = (C_1^{(n)}, \ldots, C_r^{(n)})$. The set of equivalence classes of closed trails of length
$k$, which we previously denoted $\mathcal{C}$, we will now call $\mathcal{C}^k$.

$$\mathbf{E}[\mathcal{A}h(\mathbf{C})] = \sum_{k=1}^r \sum_{s \in \mathcal{C}^k} \Big(\frac{1}{[n]_k}\mathbf{E}[h(\mathbf{C} + e_k) - h(\mathbf{C})] + \mathbf{E}\big[F_s\big(h(\mathbf{C} - e_k) - h(\mathbf{C})\big)\big]\Big)$$

$$= \sum_{k=1}^r \sum_{s \in \mathcal{C}^k} \Big(\frac{1}{[n]_k}\mathbf{E}[h(\mathbf{C} + e_k) - h(\mathbf{C})] + p_s\,\mathbf{E}[h(\mathbf{C} - e_k) - h(\mathbf{C}) \mid F_s = 1]\Big).$$

For every s € C k , we will construct on the same probability space as C a random variable Y s such 
that 



(14) 



Then 



tec* 



F s = 1. 



|e[^(c)] 



fc=l seC fc 



^ I] ( t4-E[/i(C + e fc ) - h(C)} + Ps E[h(Y s ) - h(Y s + e k ) 
fc =i se c^ V[nJfc 
I i 

< V V r^E|^(C + e fc ) - ft(C) + fc(Y.) - /i(Y s + e fe )| 

E|/i(Y s )-/i(Y s +e fe )|. 



1 



\n\ k 



+EE 

fc=l sS c fc 

By (12) and (13), respectively,

$$|h(Y_s) - h(Y_s + e_k)| \le 1,$$
$$|h(\mathbf{C} + e_k) - h(\mathbf{C}) + h(Y_s) - h(Y_s + e_k)| \le \|\mathbf{C} - Y_s\|_1.$$



Hence 



1 



| E [^(c)]| <E E o7 E H c_ Ys iii+E E 
^EE^iic- Y .,ii 1 + tic fc i 



n\k 
k 2 



k=l sG C* 



= EE rV E H c - Y -'lli + 



fc=i 



r(2rf- 1) 



fe=i sec 

Theorem 11 then follows from the following lemma:

Lemma 13. There exists an absolute constant $C_3$ with the following property. For any $1 \le k \le r$
and $s \in \mathcal{C}^k$, let $Y_s$ be distributed as in (14). There is a coupling of $\mathbf{C}$ and $Y_s$ such that for all $n$, $k$,
and $d \ge 2$ satisfying $k < n^{1/6}$,

(15) $\mathbf{E}\|\mathbf{C} - Y_s\|_1 \le \dfrac{C_3 r(2d-1)^r}{n}$.


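Proof. This proof is nearly identical to that of Lemma 10. We construct as before the graph $G_n'$ and
the random variables $F_t'$ for $t \in \mathcal{C}^i$, $1 \le i \le r$. Then $Y_s$ can be defined in the natural way as

$$Y_s = \Big(\sum_{t \in \mathcal{C}^1} F_t', \ \ldots, \ \sum_{t \in \mathcal{C}^{k-1}} F_t', \ \sum_{t \in \mathcal{C}^k,\, t \ne s} F_t', \ \sum_{t \in \mathcal{C}^{k+1}} F_t', \ \ldots, \ \sum_{t \in \mathcal{C}^r} F_t'\Big).$$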
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Figure 5. The walk 1^2^3— !-4^5^2^1is non-backtracking, but not 
cyclically non-backtracking. Note that such walks have a "lollipop" shape. 



We define Ct_x, ■ ■ ■ iQj-iUfc as before, and it remains true that F[ > F t if t € Cj for j > 0, and F( — 
if t E C l _ 1 . Doing the calculation just as in 

((i-l)Afc \ 
E p*+E(p*-f*)+ E Ef* +p - 
tecii tecs 3=1 tecj / 

Nearly identical calculations as in Lemma [TU] show that 

E „_ (i2t!£!), 

tec* 

tec- 

which completes the proof. □ 

3.2. Non-backtracking walks in random regular graphs. We now seek to transfer our results
on cycles to closed non-backtracking walks. Note that we consider $G_n$ as an undirected graph when
we discuss walks on it. A closed non-backtracking walk is a walk that begins and ends at the same vertex, and
that never follows an edge and immediately follows that same edge backwards. Let $\mathrm{NBW}_k^{(n)}$ denote
the number of closed non-backtracking walks of length $k$ on $G_n$.

If the last step of a closed non-backtracking walk is anything other than the reverse of the first
step, we say that the walk is cyclically non-backtracking. Cyclically non-backtracking walks on $G_n$
are exactly the closed non-backtracking walks whose words are cyclically reduced. Cyclically
non-backtracking walks are easier to analyze than plain non-backtracking walks because every cyclic and
inverted cyclic shift of a cyclically non-backtracking walk remains cyclically non-backtracking. Let
$\mathrm{CNBW}_k^{(n)}$ denote the number of closed cyclically non-backtracking walks of length $k$ on $G_n$.

These notions sometimes go by different names. In [Fri08], non-backtracking walks are called
irreducible, and $\mathrm{NBW}_k^{(n)}$ is called $\mathrm{IrredTr}_k(G)$. Cyclically non-backtracking walks are called strongly
irreducible, and $\mathrm{CNBW}_k^{(n)}$ is called $\mathrm{SIT}_k(G)$.

Recall that $(C_k^{(\infty)}; k \ge 1)$ are independent Poisson random variables, with $C_k^{(\infty)}$ having mean
$a(d,k)/2k$. Define

$$\mathrm{CNBW}_k^{(\infty)} = \sum_{j | k} 2j\,C_j^{(\infty)}.$$

For any cycle in $G_n$ of length $j | k$, we obtain $2j$ non-backtracking walks of length $k$ by choosing a
starting point and direction and then walking around the cycle repeatedly. We start by decomposing
$\mathrm{CNBW}_k^{(n)}$ into these walks plus the remaining "bad" walks that are not repeated cycles. We denote
these as $B_k^{(n)}$, giving us

(16) $\mathrm{CNBW}_k^{(n)} = \sum_{j | k} 2j\,C_j^{(n)} + B_k^{(n)}$.

The results of Section 3.1 give us a good understanding of $C_j^{(n)}$. Our goal now is to analyze $B_k^{(n)}$.
Specifically, we will show that in the right asymptotic regime, it is likely to be zero, implying that
$\mathrm{CNBW}_k^{(n)}$ will converge to $\mathrm{CNBW}_k^{(\infty)}$. We start with a more precise version of Lemma 7.
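The decomposition (16) can be seen on small examples. The sketch below counts closed cyclically non-backtracking walks by exhaustive search. It assumes a simple undirected graph given as an adjacency dictionary, so that backtracking means returning to the previous vertex; the multigraph with labeled edges used in the text would need edge identities instead. The function name and the example graphs are illustrative.

```python
def count_cnbw(adj, k):
    """Count closed cyclically non-backtracking walks of length k in a simple graph."""
    count = 0

    def extend(walk):
        nonlocal count
        if len(walk) == k + 1:
            # Closed, and the last step is not the reverse of the first step.
            if walk[-1] == walk[0] and walk[-2] != walk[1]:
                count += 1
            return
        for nxt in adj[walk[-1]]:
            if len(walk) >= 2 and nxt == walk[-2]:
                continue  # would immediately retrace the previous edge
            extend(walk + [nxt])

    for start in adj:
        extend([start])
    return count

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
# Two triangles sharing the vertex 0: more edges (6) than vertices (5).
bowtie = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [0, 3]}
```

On the triangle every walk is a repeated cycle, so the count is $2j = 6$ when 3 divides $k$ and 0 otherwise. On the bowtie with $k = 6$, the two triangles contribute $2 \cdot 3 \cdot 2 = 12$ repeated-cycle walks and the remaining walks, which traverse both triangles, are the "bad" ones.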

Lemma 14. With the setup of Lemma 7, suppose that $\Gamma$ has $k$ vertices and $e$ edges, with $e > k$. Then
for all $n > e$,



E[X 



(«)] 1 
r J ^ [n - k] e _ h 



< 



[n - % 



Proof. This is apparent from (4). □
Proposition 15. For all n > 2k, 

i=i 

Proof. Any closed cyclically non-backtracking walk can be thought of as a trail, with repeated vertices
in the trail now allowed. Such a walk is counted by $B_k^{(n)}$ if and only if the graph of its category has
more edges than vertices. Let $\mathcal{G}_d$ consist of all graphs of categories of a closed trail of length $k$ that
have more edges than vertices. Then

$$B_k^{(n)} = \sum_{\Gamma \in \mathcal{G}_d} X_\Gamma^{(n)},$$

using the notation of Section 3.1. To use Lemma 14, we classify the graphs in $\mathcal{G}_d$ according to how
many more edges than vertices they contain:

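$$\mathbf{E}[B_k^{(n)}] \le \sum_{i \ge 1} \big|\{\Gamma \in \mathcal{G}_d : \Gamma \text{ has exactly } i \text{ more edges than vertices}\}\big| \, \frac{1}{[n-k]_i}.$$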

A graph in $\mathcal{G}_d$ has at most $k$ edges, so the terms with $i > k$ in this sum are zero. By Lemma 18 in
[LP10], for each word $w \in \mathcal{W}$, the number of graphs in $\mathcal{G}_d$ with word $w$ and with $i$ more edges than
vertices is at most $k^{2i+2}$, completing the proof. □

It is worth noting that this proposition fails if the word "cyclically" is removed from the definition
of $B_k^{(n)}$. The problem is that walks that are non-backtracking but not cyclically non-backtracking can
have as many vertices as edges.

Corollary 16. There is an absolute constant $C_5$ such that for all $n$, $r$, and $d \ge 2$,

$$\mathbf{P}[B_k^{(n)} > 0 \text{ for some } k \le r] \le \frac{C_5 r^4 (2d-1)^r}{n}.$$

Proof. Bounding the expression from Proposition 15 by a geometric series,

$$\mathbf{E}[B_r^{(n)}] \le \frac{a(d,r)\,r^4}{n-r} \cdot \frac{n-2r}{n-2r-r^2}.$$

If $r > n^{1/4}$, then $r^4(2d-1)^r/n > 1$, and the corollary is trivially true for any $C_5 \ge 1$. Thus we may
assume that $r \le n^{1/4}$. In this case, the expression $(n-2r)/(n-2r-r^2)$ is bounded by an absolute
constant. This and (1) imply that for some constant $C_4$,

$$\mathbf{E}[B_r^{(n)}] \le \frac{C_4 r^4 (2d-1)^r}{n}.$$

Since $B_k^{(n)}$ is integer-valued,

$$\mathbf{P}[B_k^{(n)} > 0 \text{ for some } k \le r] \le \sum_{k=1}^r \mathbf{P}[B_k^{(n)} > 0] \le \sum_{k=1}^r \mathbf{E}[B_k^{(n)}] \le \sum_{k=1}^r \frac{C_4 k^4 (2d-1)^k}{n} \le \frac{C_5 r^4 (2d-1)^r}{n}$$

for some choice of the constant $C_5$. □




The following fact follows directly from the definition of total variation distance, and we omit its
proof.

Lemma 17. Let $X$ and $Y$ be random variables on a metric space $S$, and let $T$ be any metric space.
For any measurable $f : S \to T$,

$$d_{TV}(f(X), f(Y)) \le d_{TV}(X, Y).$$
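For laws on a finite set, the inequality of Lemma 17 is easy to check directly. The following Python sketch represents a law as a dictionary from outcomes to probabilities; the two example laws and the map `x % 2` are made up for illustration.

```python
from collections import defaultdict

def total_variation(p, q):
    """Total variation distance between two laws given as {outcome: probability}."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def pushforward(p, f):
    """Law of f(X) when X has law p."""
    image = defaultdict(float)
    for x, mass in p.items():
        image[f(x)] += mass
    return dict(image)

p = {0: 0.5, 1: 0.3, 2: 0.2}
q = {0: 0.2, 1: 0.3, 2: 0.5}
parity = lambda x: x % 2
tv_before = total_variation(p, q)
tv_after = total_variation(pushforward(p, parity), pushforward(q, parity))
```

Here the map merges the outcomes 0 and 2, on which the two laws differ, so the distance drops from 0.3 to 0.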
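It is now straightforward to give a result on non-backtracking walks analogous to Theorem 11.

Proposition 18. There is a constant $C_6$ such that for all $n$, $r$, and $d \ge 2$,

$$d_{TV}\Big(\big(\mathrm{CNBW}_k^{(n)}; 1 \le k \le r\big), \big(\mathrm{CNBW}_k^{(\infty)}; 1 \le k \le r\big)\Big) \le \frac{C_6(2d-1)^{2r}}{n}.$$

Proof. We start by recalling the decomposition of $\mathrm{CNBW}_k^{(n)}$ into good and bad walks given in (16).
Let $G_k^{(n)} = \sum_{j | k} 2j\,C_j^{(n)}$, so that $\mathrm{CNBW}_k^{(n)} = G_k^{(n)} + B_k^{(n)}$. By Lemma 17 and Theorem 11,

(17) $$d_{TV}\Big(\big(G_k^{(n)}; 1 \le k \le r\big), \big(\mathrm{CNBW}_k^{(\infty)}; 1 \le k \le r\big)\Big) \le d_{TV}\Big(\big(C_k^{(n)}; 1 \le k \le r\big), \big(C_k^{(\infty)}; 1 \le k \le r\big)\Big) \le \frac{C_2(2d-1)^{2r}}{n}.$$

Then for any $A \subseteq \mathbb{Z}_+^r$,

$$\mathbf{P}\big[(\mathrm{CNBW}_k^{(n)}; 1 \le k \le r) \in A\big] - \mathbf{P}\big[(\mathrm{CNBW}_k^{(\infty)}; 1 \le k \le r) \in A\big]$$
$$\le \mathbf{P}\Big[\big\{(G_k^{(n)}; 1 \le k \le r) \in A\big\} \cup \bigcup_{k=1}^r \{B_k^{(n)} > 0\}\Big] - \mathbf{P}\big[(\mathrm{CNBW}_k^{(\infty)}; 1 \le k \le r) \in A\big]$$
$$\le \frac{C_2(2d-1)^{2r}}{n} + \frac{C_5 r^4(2d-1)^r}{n}$$

by (17) and Corollary 16. For any $n$ and $d$, since $d \ge 2$ and thus $2d - 1 \ge 3$, the first term is larger
than the second for all but at most a finite number of values of $r$, bounded independently of $n$ and $d$. Therefore
there exists a constant $C_6$ satisfying the conditions of the theorem. □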

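Corollary 19. For any fixed $r$ and $d \ge 2$,

$$\big(\mathrm{CNBW}_1^{(n)}, \ldots, \mathrm{CNBW}_r^{(n)}\big) \xrightarrow{\mathcal{L}} \big(\mathrm{CNBW}_1^{(\infty)}, \ldots, \mathrm{CNBW}_r^{(\infty)}\big)$$

as $n \to \infty$.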

To achieve a version of the above corollary that holds when $d$ grows, we need to center and scale
our random variables $\mathrm{CNBW}_k^{(n)}$.

Proposition 20. Let $r$ be fixed, and suppose that $d = d(n) \to \infty$ as $n \to \infty$, and that $(2d-1)^{2r} = o(n)$.
Let $\widetilde{\mathrm{CNBW}}_k^{(n)} = (2d-1)^{-k/2}\big(\mathrm{CNBW}_k^{(n)} - \mathbf{E}[\mathrm{CNBW}_k^{(\infty)}]\big)$. Let $Z_1, \ldots, Z_r$ be independent normal
random variables with $\mathbf{E}Z_k = 0$ and $\mathbf{E}Z_k^2 = 2k$. Then as $n \to \infty$,

$$\big(\widetilde{\mathrm{CNBW}}_1^{(n)}, \ldots, \widetilde{\mathrm{CNBW}}_r^{(n)}\big) \xrightarrow{\mathcal{L}} (Z_1, \ldots, Z_r).$$

Proof. Let $X_k^{(n)} = (2d-1)^{-k/2}\big(\mathrm{CNBW}_k^{(\infty)} - \mathbf{E}[\mathrm{CNBW}_k^{(\infty)}]\big)$. We note that $\mathrm{CNBW}_k^{(\infty)}$ depends on $d$
(and hence on $n$), although we have suppressed this dependence from the notation. By Proposition 18
and Lemma 17, the total variation distance between the laws of $(\widetilde{\mathrm{CNBW}}_k^{(n)}; 1 \le k \le r)$ and $(X_k^{(n)}; 1 \le k \le r)$
converges to zero as $n \to \infty$. Hence it suffices to show that $(X_k^{(n)}; 1 \le k \le r) \xrightarrow{\mathcal{L}} (Z_1, \ldots, Z_r)$
as $n \to \infty$.

Let $\lambda_k = a(d,k)/2k$ as in Theorem 11. We can write $X_k^{(n)}$ as

$$X_k^{(n)} = 2k(2d-1)^{-k/2}\big(C_k^{(\infty)} - \lambda_k\big) + (2d-1)^{-k/2} \sum_{j | k,\ j < k} \big(2j\,C_j^{(\infty)} - a(d,j)\big).$$

Using (1), it is a straightforward calculation to show that as $n \to \infty$,

$$\Big(2k(2d-1)^{-k/2}\big(C_k^{(\infty)} - \lambda_k\big); 1 \le k \le r\Big) \xrightarrow{\mathcal{L}} (Z_1, \ldots, Z_r).$$

Hence we need only show that for all $k \le r$,

$$(2d-1)^{-k/2} \sum_{j | k,\ j < k} \big(2j\,C_j^{(\infty)} - a(d,j)\big) \to 0 \quad \text{in probability}.$$

We calculate

$$\mathrm{Var}\Big[(2d-1)^{-k/2} \sum_{j | k,\ j < k} \big(2j\,C_j^{(\infty)} - a(d,j)\big)\Big] = (2d-1)^{-k} \sum_{j | k,\ j < k} 2j\,a(d,j),$$

and the statement follows by (1) and Chebyshev's inequality. □
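The variance computations in this proof can be checked numerically. The sketch below assumes that $a(d,k)$ counts cyclically reduced words of length $k$ on $d$ letters and their inverses, with the standard closed form $(2d-1)^k + 1$ for odd $k$ and $(2d-1)^k + 2d - 1$ for even $k$. That closed form is taken here as an assumption about the referenced equation (1), which lies outside this section, and is verified by brute force for small cases. The function names are illustrative.

```python
from itertools import product

def count_cyclically_reduced(d, k):
    """Brute-force count of cyclically reduced words; letters are +-1, ..., +-d."""
    letters = [sign * i for i in range(1, d + 1) for sign in (1, -1)]
    count = 0
    for word in product(letters, repeat=k):
        # No letter may be followed (cyclically) by its inverse.
        if all(word[i] != -word[(i + 1) % k] for i in range(k)):
            count += 1
    return count

def a_closed_form(d, k):
    return (2 * d - 1) ** k + (2 * d - 1 if k % 2 == 0 else 1)

def leading_variance(d, k):
    """Variance of 2k (2d-1)^(-k/2) (C_k - lambda_k) when C_k is Poisson(a(d,k)/2k)."""
    return 2 * k * a_closed_form(d, k) / (2 * d - 1) ** k

def remainder_variance(d, k):
    """Variance of the sum over proper divisors j of k, scaled by (2d-1)^(-k/2)."""
    proper = [j for j in range(1, k) if k % j == 0]
    return sum(2 * j * a_closed_form(d, j) for j in proper) / (2 * d - 1) ** k
```

As $d$ grows, `leading_variance(d, k)` approaches $2k$ while `remainder_variance(d, k)` tends to zero, matching the limit $\mathbf{E}Z_k^2 = 2k$.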

The remaining results in this section refer to the weak convergence set-up in Section [5] 
Theorem 21. Suppose that $d$ is fixed, that $r_n \to \infty$, and that

(18) $(2d-1)^{2r_n} = o(n)$.

Let

$$\Theta_k = \mathbf{E}\big[(\mathrm{CNBW}_k^{(\infty)})^2\big] = \sum_{j | k} 2j\,a(d,j) + \Big(\sum_{j | k} a(d,j)\Big)^2.$$

Let $(b_k)_{k \in \mathbb{N}}$ be any fixed positive summable sequence. Define the weights $\omega$ of the weak convergence set-up by setting

$$\omega_k = b_k/\Theta_k, \quad k \in \mathbb{N}.$$

Let $P_n$ be the law of the sequence $(\mathrm{CNBW}_1^{(n)}, \ldots, \mathrm{CNBW}_{r_n}^{(n)}, 0, 0, \ldots)$. Then $\{P_n\}$, considered as a
sequence in $\mathcal{P}(X)$, converges weakly to the law of the random vector $(\mathrm{CNBW}_k^{(\infty)}; k \in \mathbb{N})$.

Proof. We first claim that the random vector $(\mathrm{CNBW}_k^{(\infty)}; k \in \mathbb{N})$ almost surely lies in $L^2(\omega)$. This
follows by a deliberate choice of $\omega$:

$$\mathbf{E}\sum_{k=1}^\infty \big(\mathrm{CNBW}_k^{(\infty)}\big)^2 \omega_k = \sum_{k=1}^\infty \Theta_k\,\omega_k = \sum_{k=1}^\infty b_k < \infty,$$

which proves finiteness almost surely. The computation of $\Theta_k$ is straightforward.

By Corollary 1 191 we know that all subsequential weak limits of P n have the same finite-dimensional 
distributions as (CNBW^ 00 " 1 ; k G N), and by Lemma El they are in fact identical to the law of 

(CNBW^sfc G N). Thus it suffices to show that {Pi,P 2 ,...} is tight. To do this we will apply 
Lemma [3] by choosing a suitable infinite cube. 

In other words, we must show that given any $\epsilon > 0$, there exists an element $a = (a_m)_{m \in \mathbb{N}} \in L^2(\omega)$
such that

(19) $$\sup_n \mathbf{P}\Big[\bigcup_{k=1}^{r_n} \{\mathrm{CNBW}_k^{(n)} > a_k\}\Big] < \epsilon.$$

In fact, our choice of $a$ is

$$a_k = (\alpha + 2)\,\mathbf{E}\big(\mathrm{CNBW}_k^{(\infty)}\big) = (\alpha + 2) \sum_{j | k} a(d,j),$$

for some positive $\alpha$ determined by $\epsilon$. Note that, by an obvious calculation, $a \in L^2(\omega)$.
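By Proposition 18, for any $\eta > 0$,

(20) $$\mathbf{P}\Big[\bigcup_{k=1}^{r_n} \{\mathrm{CNBW}_k^{(n)} > a_k\}\Big] < \mathbf{P}\Big[\bigcup_{k=1}^{r_n} \{\mathrm{CNBW}_k^{(\infty)} > a_k\}\Big] + \eta$$

for all sufficiently large $n$. Now, we apply the union bound

(21) $$\sup_n \mathbf{P}\Big[\bigcup_{k=1}^{r_n} \{\mathrm{CNBW}_k^{(\infty)} > a_k\}\Big] \le \sum_{k=1}^\infty \mathbf{P}\big[\mathrm{CNBW}_k^{(\infty)} > a_k\big]$$

and bound the right side by a simple large deviation estimate.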
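We start with the decomposition

(22) $$\mathrm{CNBW}_k^{(\infty)} = \sum_{j | k} 2j\,C_j^{(\infty)},$$

where $\{C_j^{(\infty)}\}$ are independent Poisson random variables with mean $a(d,j)/2j$. Thus, for any $\lambda > 0$,
the exponential moments are easy to derive:

$$\mathbf{E}\,e^{\lambda\,\mathrm{CNBW}_k^{(\infty)}} = \prod_{j | k} \exp\Big(\frac{a(d,j)}{2j}\big(e^{2\lambda j} - 1\big)\Big) = \exp\Big[\sum_{j | k} a(d,j)\,\frac{e^{2\lambda j} - 1}{2j}\Big].$$

Hence, by Markov's inequality, we get

$$\mathbf{P}\big(\mathrm{CNBW}_k^{(\infty)} > a_k\big) \le e^{-\lambda a_k}\,\mathbf{E}\big(e^{\lambda\,\mathrm{CNBW}_k^{(\infty)}}\big) \le \exp\Big[\sum_{j | k} a(d,j)\Big(\frac{e^{2\lambda j} - 1}{2j} - (\alpha + 2)\lambda\Big)\Big].$$

An easy analysis shows that if $\lambda = \log 2/(2k)$, one must have

$$\frac{e^{2\lambda j} - 1}{2j} \le 2\lambda, \quad \text{for all } j \le k.$$

Hence,

$$\mathbf{P}\big(\mathrm{CNBW}_k^{(\infty)} > a_k\big) \le \exp\Big[-\frac{\alpha \log 2}{2k} \sum_{j | k} a(d,j)\Big] \le 2^{-\alpha(2d-1)^k/2k}.$$

The above expression is clearly summable in $k$, and thus from (21) we get

$$\sup_n \mathbf{P}\Big[\bigcup_{k=1}^{r_n} \{\mathrm{CNBW}_k^{(\infty)} > a_k\}\Big] \le \sum_{k=1}^\infty 2^{-\alpha(2d-1)^k/2k}.$$

The right side can be made as small as we want by choosing a large enough $\alpha$. This is enough to
establish (19). □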
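The two elementary steps in this large deviation estimate, the bound $(e^{2\lambda j} - 1)/(2j) \le 2\lambda$ at $\lambda = \log 2/(2k)$ and the resulting exponent, can be checked numerically. The sketch below again assumes the closed form $a(d,k) = (2d-1)^k + 1$ for odd $k$ and $(2d-1)^k + 2d - 1$ for even $k$, which is an assumption about equation (1) rather than something stated in this section. All function names are illustrative.

```python
import math

def a_closed_form(d, k):
    return (2 * d - 1) ** k + (2 * d - 1 if k % 2 == 0 else 1)

def divisors(k):
    return [j for j in range(1, k + 1) if k % j == 0]

def chernoff_exponent(d, k, alpha):
    """Exponent of the Markov bound at lambda = log 2 / (2k)."""
    lam = math.log(2) / (2 * k)
    return sum(
        a_closed_form(d, j) * ((math.exp(2 * lam * j) - 1) / (2 * j) - (alpha + 2) * lam)
        for j in divisors(k)
    )

def final_exponent(d, k, alpha):
    """Natural logarithm of 2^(-alpha (2d-1)^k / 2k)."""
    return -alpha * math.log(2) * (2 * d - 1) ** k / (2 * k)
```

Working with exponents rather than probabilities avoids floating point underflow, since the bounds decay doubly exponentially in $k$.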

We now prove a corresponding theorem when $d$ is growing with $n$. Let $\mu_k(d)$ denote $\mathbf{E}[\mathrm{CNBW}_k^{(\infty)}]$,
emphasizing its dependence on $d$. We define

(23) $N_k^{(n)} = (2d-1)^{-k/2}\big(\mathrm{CNBW}_k^{(n)} - \mu_k(d)\big)$.

Theorem 22. Suppose that $d = d(n) \to \infty$ and $r_n \to \infty$ as $n \to \infty$. Suppose that

$$(2d-1)^{2r_n} = o(n).$$

We define the weights $\omega$ by setting $\omega_k = b_k/(k^2 \log k)$, where $(b_k)_{k \in \mathbb{N}}$ is any fixed positive summable
sequence. Let $P_n$ be the law of the sequence $(N_1^{(n)}, \ldots, N_{r_n}^{(n)}, 0, 0, \ldots)$. Let $Z_1, Z_2, \ldots$ be independent
normal random variables with $\mathbf{E}Z_k = 0$ and $\mathbf{E}Z_k^2 = 2k$. Then $P_n$, considered as an element of $\mathcal{P}(X)$,
converges weakly to the law of the random vector $(Z_k; k \in \mathbb{N})$.

To proceed with the proof we will need a lemma on measure concentration. We will use a result
on the modified logarithmic Sobolev inequality that can be found in the Berlin notes by Ledoux [Led97].
For the convenience of the reader we reproduce (a slight modification of) the statement of Theorem
5.5 in [Led97, page 71] for a joint product measure. Please note that although the statement of
Theorem 5.5 is written for an iid product measure, its proof goes through even when the coordinate
laws are different (but independent). In fact, the crucial step is the tensorization of entropy ([Led97,
Proposition 2.2]), which is generally true.

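Lemma 23. For $n \in \mathbb{N}$, let $\mu_1, \mu_2, \ldots, \mu_n$ be $n$ probability measures on $\mathbb{N}$. For functions $f$ on $\mathbb{N}$,
define $Df(x) = f(x+1) - f(x)$ to be the discrete derivative. Define the entropy of $f$ under $\mu_i$ by

$$\mathrm{Ent}_{\mu_i}(f) = \mathbf{E}_{\mu_i}(f \log f) - \mathbf{E}_{\mu_i}(f) \log \mathbf{E}_{\mu_i}(f).$$

Assume that there exist two positive constants $c$ and $d$ such that for every $f$ on $\mathbb{N}$ such that $\sup_x |Df| \le \lambda$,
one has

$$\mathrm{Ent}_{\mu_i}(e^f) \le c\,e^{d\lambda}\,\mathbf{E}_{\mu_i}\big(|Df|^2 e^f\big)$$

as functions of $\lambda$.

Let $\mu$ denote the product measure of the $\mu_i$'s. Let $F$ be a function on $\mathbb{N}^n$ such that for every $x \in \mathbb{N}^n$,

$$\sum_{i=1}^n |F(x + e_i) - F(x)|^2 \le \alpha^2, \quad \text{and} \quad \max_{1 \le i \le n} |F(x + e_i) - F(x)| \le \beta.$$

Then $\mathbf{E}_\mu(|F|) < \infty$ and, for every $r \ge 0$,

$$\mu\big(F \ge \mathbf{E}_\mu(F) + r\big) \le \exp\Big(-\frac{r}{2d\beta} \log\Big(1 + \frac{\beta d r}{4c\alpha^2}\Big)\Big).$$
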

Proof of Theorem \'2'2l The proof is similar in spirit to the proof of Theorem l21l As in that proof, the 
limiting measure is supported on L 2 (w). By Proposition [20l and Lemma we need only show that 
the family {P\,P2, . . .} is tight. As in Theorem |2T| we need to choose a suitable infinite cube. 
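Choose $\epsilon > 0$. Define

$$a_k = \alpha k \sqrt{\log k},$$

for some positive $\alpha > 1$ depending on $\epsilon$. Then $a \in L^2(\omega)$.
We need to show that, for a suitable choice of $\alpha$,

$$\sup_n \mathbf{P}\Big[\bigcup_{k=1}^{r_n} \{|N_k^{(n)}| > a_k\}\Big] < \epsilon.$$

By Lemma 17 and Proposition 18, for any $\eta > 0$,

(24) $$\mathbf{P}\Big[\bigcup_{k=1}^{r_n} \{|N_k^{(n)}| > a_k\}\Big] < \mathbf{P}\Big[\bigcup_{k=1}^{r_n} \big\{|\mathrm{CNBW}_k^{(\infty)} - \mu_k(d)| > a_k(2d-1)^{k/2}\big\}\Big] + \eta$$

for all sufficiently large $n$.
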

Note as before that $\mathrm{CNBW}_k^{(\infty)}$ depends on $d$ (and hence on $n$).
Proceeding as before, we need to estimate

$$\mathbf{P}\big(|\mathrm{CNBW}_k^{(\infty)} - \mu_k(d)| > a_k(2d-1)^{k/2}\big)$$

for our choice of $a_k$.

Let $\mathrm{Poi}(\theta)$ denote as before the Poisson law with mean $\theta$. We will denote expectation with respect
to $\mathrm{Poi}(\theta)$ by $\mathbf{E}_{\pi_\theta}$. As shown in Corollary 5.3 in [Led97, page 69], $\mathrm{Poi}(\theta)$ satisfies the following modified
logarithmic Sobolev inequality: for any $f$ on $\mathbb{N}$ with strictly positive values

(25) $\mathrm{Ent}_{\pi_\theta}(f) \le \theta\,\mathbf{E}_{\pi_\theta}\Big(\dfrac{1}{f}|Df|^2\Big)$.

Here $\mathrm{Ent}_{\pi_\theta}(f)$ refers to the entropy of $f$ under $\mathrm{Poi}(\theta)$.

Let now $f$ on $\mathbb{N}$ satisfy $\sup_x |Df(x)| \le \lambda$. By eqn. (5.16) in [Led97, page 70], (25) implies that
$\mathrm{Poi}(\theta)$ satisfies the inequality

(26) $\mathrm{Ent}_{\pi_\theta}(e^f) \le C e^{2\lambda}\,\mathbf{E}_{\pi_\theta}\big(|Df|^2 e^f\big)$, for any $C \ge \theta$.

Now fix some $k \in \mathbb{N}$ and consider the product measure of the random vector $(C_j^{(\infty)}, j | k)$. Each
coordinate satisfies inequality (26) and one can take the common constant $C$ to be $a(d,k)/2k$.

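We apply Lemma 23 on the function $F(x) = \sum_{j | k} 2j\,x_j$. It is straightforward to see that one can
take $\alpha^2 = 4k^3$, $\beta = 2k$. Thus, we get the following tail estimate for any $r > 0$:

$$\mathbf{P}\big(F - \mathbf{E}(F) \ge r\big) \le \exp\Big(-\frac{r}{8k} \log\Big(1 + \frac{4kr}{16Ck^3}\Big)\Big).$$

Replacing $F$ by $-F$ we obtain a two-sided bound

$$\mathbf{P}\big(|F - \mathbf{E}(F)| > r\big) \le 2\exp\Big(-\frac{r}{8k} \log\Big(1 + \frac{4kr}{16Ck^3}\Big)\Big).$$

Hence we have shown that for any $r > 0$, the following estimate holds

$$\mathbf{P}\big(|\mathrm{CNBW}_k^{(\infty)} - \mu_k(d)| > r\big) \le 2\exp\Big(-\frac{r}{8k} \log\Big(1 + \frac{8k^2 r}{16a(d,k)k^3}\Big)\Big) = 2\exp\Big(-\frac{r}{8k} \log\Big(1 + \frac{r}{2a(d,k)k}\Big)\Big).$$

Recall from (1) that $a(d,k) \sim (2d-1)^k$. Therefore

$$\mathbf{P}\big(|\mathrm{CNBW}_k^{(\infty)} - \mu_k(d)| > a_k(2d-1)^{k/2}\big) \le 2\exp\Big(-\frac{a_k(2d-1)^{k/2}}{8k} \log\Big(1 + \frac{a_k}{2(2d-1)^{k/2}k}\Big)\Big).$$

Now, $\log(1 + x) \ge x/2$ for all $0 \le x \le 1$. Using this simple bound we get that for all $(k, d)$ such
that $\alpha\sqrt{\log k} \le 2(2d-1)^{k/2}$, we have

$$\mathbf{P}\big(|\mathrm{CNBW}_k^{(\infty)} - \mu_k(d)| > a_k(2d-1)^{k/2}\big) \le 2\exp\Big(-\frac{a_k^2}{32k^2}\Big) = 2\exp\Big(-\frac{\alpha^2 \log k}{32}\Big) = 2k^{-\alpha^2/32}.$$

The right side is summable whenever $\alpha^2 > 32$. The rest of the proof follows just as in Theorem 21. □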
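The constants used when applying Lemma 23 can be checked with a few lines of Python. The sketch below verifies that the increments of $F(x) = \sum_{j|k} 2j\,x_j$ satisfy the stated bounds $\alpha^2 = 4k^3$ and $\beta = 2k$, and evaluates the final tail bound for the choice $a_k = \alpha k\sqrt{\log k}$. The function names are illustrative.

```python
import math

def divisors(k):
    return [j for j in range(1, k + 1) if k % j == 0]

def lipschitz_constants(k):
    """Return (sum of squared increments, largest increment) of F(x) = sum_{j|k} 2 j x_j."""
    steps = [2 * j for j in divisors(k)]   # F(x + e_j) - F(x) = 2j
    return sum(s * s for s in steps), max(steps)

def tail_bound(alpha, k):
    """Evaluate 2 exp(-a_k^2 / (32 k^2)) with a_k = alpha * k * sqrt(log k)."""
    a_k = alpha * k * math.sqrt(math.log(k))
    return 2.0 * math.exp(-a_k ** 2 / (32.0 * k ** 2))
```

The sum of squared increments is at most $4k^3$ because $k$ has at most $k$ divisors, each at most $k$; the tail bound equals $2k^{-\alpha^2/32}$, which is summable in $k$ exactly when $\alpha^2 > 32$.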



4. Spectral concentration 

The problem of estimating the spectral gap of a $d$-regular graph has been approached primarily
in two ways, the method of moments and the counting method of Kahn and Szemeredi, presented
in [FKS89]. The method of moments has been developed in the work of Broder and Shamir [BS87]
and very extensively by Friedman [Fri91], [Fri08]. In his work, Friedman, relying on $d$ being fixed
independently of $n$, developed extremely fine control over the magnitude of the second eigenvalue.
On the other hand in [FKS89], Kahn and Szemeredi only show that the second largest eigenvalue has
magnitude $O(\sqrt{d})$. While weaker than Friedman's bound, their techniques readily extend to the case
where $d$ is allowed to grow as a function of $n$; this observation has been informally made by others, and
communicated to us by Vu and Friedman. Here we will formalize it, and present the Kahn-Szemeredi
argument in the context of growing $d$ to demonstrate the method's validity, as well as to develop some
handle on the constants in the bound.

Specifically, we will prove

Theorem 24. For any $m > 0$, there is a constant $C = C(m)$ and universal constants $K$ and $c$ so
that

$$\mathbf{P}\big[\exists\, i \ne 1 : |\lambda_i| \ge C\sqrt{d}\big] \le n^{-m} + K\exp(-cn).$$

Further, the constant $C$ may be taken to be $36000 + 2400m$.

In what follows, let $M$ be the adjacency matrix for the $2d$-regular graph $G_n$. Recall that this matrix
can be realized by sampling independently and uniformly $d$ permutation matrices $A_1, A_2, \ldots, A_d$ and
defining

$$M = A_1 + A_1^t + A_2 + A_2^t + \cdots + A_d + A_d^t.$$

The starting point is the variational characterization of the eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ of $M$,
which states that

$$\max\{\lambda_2, |\lambda_n|\} = \sup_{w \perp \mathbf{1},\ \|w\| = 1} |w^t M w|.$$

Additional flexibility is provided by replacing this symmetric version of the Rayleigh quotient by the
asymmetric version,

$$\sup_{w, v \perp \mathbf{1},\ \|v\| = \|w\| = 1} |v^t M w|.$$
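The construction of $M$ is simple to simulate. The following Python sketch builds the matrix from $d$ random permutations and evaluates the bilinear form for a unit vector orthogonal to the all-ones vector. It uses plain lists and assumes vertices labeled 0, ..., n-1; the function names, the seed, and the sizes are arbitrary choices for illustration.

```python
import math
import random

def permutation_model_adjacency(n, d, rng):
    """Return M = sum over l of (A_l + A_l^t) for d uniform random permutation matrices."""
    M = [[0] * n for _ in range(n)]
    for _ in range(d):
        perm = list(range(n))
        rng.shuffle(perm)
        for i, j in enumerate(perm):
            M[i][j] += 1   # entry of A_l
            M[j][i] += 1   # entry of A_l^t
    return M

def bilinear_form(v, M, w):
    """Compute v^t M w."""
    return sum(v[i] * M[i][j] * w[j] for i in range(len(v)) for j in range(len(w)))

def centered_unit_vector(n, rng):
    """A random unit vector orthogonal to the all-ones vector."""
    raw = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(raw) / n
    centered = [x - mean for x in raw]
    norm = math.sqrt(sum(x * x for x in centered))
    return [x / norm for x in centered]

rng = random.Random(1)
n, d = 20, 3
M = permutation_model_adjacency(n, d, rng)
w = centered_unit_vector(n, rng)
rayleigh = bilinear_form(w, M, w)
```

Every row of `M` sums to $2d$, so the all-ones vector is an eigenvector with eigenvalue $2d$, and any Rayleigh quotient is at most $2d$ in absolute value. The theorem concerns how much smaller the quotient is on the orthogonal complement.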

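The random variables $v^t M w$, for fixed $w$ and $v$, are substantially more tractable than the supremum.
To be able to work with these random variables instead of the supremum, we will pass to a finite set
of vectors which approximate the sphere $S = \{w \perp \mathbf{1} : \|w\| = 1\}$. More specifically, we will only
consider those $w$ and $v$ lying on the subset of the lattice $T$ defined as

$$T := \Big\{\frac{\delta z}{\sqrt{n}} : z \in \mathbb{Z}^n,\ \|z\|^2 \le \frac{n}{\delta^2},\ z \perp \mathbf{1}\Big\}$$

for a fixed $\delta > 0$.

Vectors from $T$ approximate vectors from $S$ in the sense that every vector of $(1 - \delta)S$ is a convex
combination of points in $T$. (See Lemma 2.3 of [FO05].) Thus

$$\frac{1}{(1-\delta)^2} \sup_{\|v\| = \|w\| = 1} \big|[1-\delta]v^t M [1-\delta]w\big| \le \frac{1}{(1-\delta)^2} \sup_{x, y \in T} |x^t M y|.$$

Furthermore, by a volume argument, it is possible to bound the cardinality of $T$ as

$$|T| \Big(\frac{\delta}{\sqrt{n}}\Big)^n \le \mathrm{Vol}\Big\{x \in \mathbb{R}^n : \|x\| \le 1 + \frac{\delta}{2}\Big\} = \frac{\big(1 + \frac{\delta}{2}\big)^n \sqrt{\pi}^{\,n}}{\Gamma\big(\frac{n}{2} + 1\big)}.$$

Employing Stirling's approximation, this shows

$$|T| \le C\Big[\frac{\big(1 + \frac{\delta}{2}\big)\sqrt{2\pi e}}{\delta}\Big]^n$$

for some universal constant $C$.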
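The passage from the volume bound to the exponential bound on $|T|$ can be checked numerically. The sketch below compares logarithms of the two sides for a range of dimensions, using `math.lgamma` for the Gamma function. The tested values of $\delta$ are arbitrary, and the check suggests, without proving, that the universal constant can be taken to be 1.

```python
import math

def log_volume_bound(n, delta):
    """log of (sqrt(n)/delta)^n * Vol(ball of radius 1 + delta/2 in R^n)."""
    return (
        n * math.log(math.sqrt(n) / delta)
        + n * math.log(1 + delta / 2)
        + (n / 2) * math.log(math.pi)
        - math.lgamma(n / 2 + 1)
    )

def log_stirling_bound(n, delta):
    """log of ((1 + delta/2) * sqrt(2 pi e) / delta)^n."""
    return n * math.log((1 + delta / 2) * math.sqrt(2 * math.pi * math.e) / delta)
```

The difference between the two logarithms reduces to $\frac{n}{2}\log\frac{n}{2e} - \log\Gamma(\frac{n}{2} + 1)$, which is nonpositive because $\Gamma(m+1) \ge (m/e)^m$.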

The breakthrough of Kahn and Szemeredi was to realize that $x^t M y$ can be controlled by virtue of
a split into two types of terms. If $x^t M y$ is written as a sum

$$x^t M y = \sum_{(u,v) \in \mathcal{L}} x_u M_{uv} y_v + \sum_{(u,v) \in \mathcal{H}} x_u M_{uv} y_v,$$

then the contribution of the first sum turns out to be very nearly its mean because of the Lipschitz
dependence of the sum on the edges of the graph. The contribution of the second sum turns out to
never be too large for a very different reason: the number of edges between any two sets in the graph
is on the same order as its mean. Following Feige and Ofek, for a fixed pair of vectors $(x, y) \in T^2$,
define the light couples $\mathcal{L} = \mathcal{L}(x, y)$ to be all those ordered pairs $(u, v)$ so that $|x_u y_v| \le \sqrt{d}/n$, and let
the heavy couples $\mathcal{H}$ be all those pairs that are not light.
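The split into light and heavy couples is straightforward to compute. The sketch below assumes the threshold $\sqrt{d}/n$ for light couples and works with arbitrary unit vectors rather than lattice points of $T$. The function names and the example vectors are illustrative.

```python
import math
import random

def split_couples(x, y, d):
    """Partition ordered pairs (u, v) into light and heavy couples."""
    n = len(x)
    threshold = math.sqrt(d) / n
    light, heavy = [], []
    for u in range(n):
        for v in range(n):
            if abs(x[u] * y[v]) <= threshold:
                light.append((u, v))
            else:
                heavy.append((u, v))
    return light, heavy

def unit_vector(n, rng):
    raw = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(t * t for t in raw))
    return [t / norm for t in raw]

rng = random.Random(2)
n, d = 30, 3
x, y = unit_vector(n, rng), unit_vector(n, rng)
light, heavy = split_couples(x, y, d)
heavy_mass = sum(abs(x[u] * y[v]) for u, v in heavy)
```

Since every heavy couple has $|x_u y_v| > \sqrt{d}/n$, the total heavy mass is at most $(n/\sqrt{d})\sum x_u^2 y_v^2 \le n/\sqrt{d}$ for unit vectors, which is the estimate that controls the mean of the light part.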
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4.1. Controlling the contribution of the light couples. Part of the advantage of having selected 
only the light couples is that their expected contribution is of the "correct" order, as the lemma below 
shows. 



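Lemma 25.

$$\Big|\mathbf{E} \sum_{(u,v) \in \mathcal{L}} x_u M_{uv} y_v\Big| \le 2\sqrt{d}.$$

Proof. By symmetry, $\mathbf{E}M_{uv}$ is simply equal to $\frac{2d}{n}$, so that

$$\mathbf{E} \sum_{(u,v) \in \mathcal{L}} x_u M_{uv} y_v = \frac{2d}{n} \sum_{(u,v) \in \mathcal{L}} x_u y_v.$$

Because each of $x_u$ and $y_v$ sum to 0, the sum over light couples is equal in magnitude to the sum over
heavy couples. Thus, it suffices to estimate

$$\Big|\sum_{(u,v) \in \mathcal{H}} x_u y_v\Big| \le \sum_{(u,v) \in \mathcal{H}} \frac{x_u^2 y_v^2}{|x_u y_v|} \le \frac{n}{\sqrt{d}} \sum_{(u,v) \in \mathcal{H}} x_u^2 y_v^2 \quad \text{by the defining property of heavy couples,}$$

$$\le \frac{n}{\sqrt{d}}.$$

In the last step we recall that both $\|x\|, \|y\| \le 1$. □
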

To show that not only the expectation, but the sum itself is of the correct order, we must prove 
a concentration estimate for this sum. For technical reasons, it is helpful if we deal with sums over 
fewer terms. To this end, define 

A = A 1 + A 2 + ■ ■ ■ + A d . 
In terms of A it is enough to insist that for every x, y G T 



^ x uA uv yi 
(u,v)ec 



for then by symmetry, 



< tVd 



< 2tVd, 



^ x v,M uv y v 
(u,v)ec 

for all x, y € T. As a further simplification, we will not prove a tail estimate for the whole quantity 
^2 x u A uv y v ; instead, fix an arbitrary collection U of vertices of size at most [2]. Having fixed 

(u,v)e£ 

this collection, we will show a tail estimate for Y^,( u v)eCnUx[n] x uA uv y v . This truncation is made to 
simplify a variance estimate (see (|28p). and it might be possible to avoid it entirely. 



Theorem 26. For every x,y £ T, and every U C [n] with \U\ < 



> 



tVd 



< C exp - 



Ci + C 2 t 



^ x u,A uv y v l*jX u A uv y v 

(u,v)£CnUx[n] 

for some universal constants Co, C\ and C 2 . These constants can be taken as 2, 64, and 8/3 respec- 
tively. 

Proof. Let C be C n U x [n]. We will estimate tail probabilities for x uA uv y v . 

(u,v)ec 

The main tool needed to establish this result is Freedman's martingale inequality [Fre75]. Let
$X_1, X_2, \ldots$ be martingale increments. Write $\mathcal{F}_k$ for the natural filtration induced by these increments,
and define $V_k = \mathbf{E}[X_k^2 \mid \mathcal{F}_{k-1}]$. If $S_n$ is the partial sum $S_n = \sum_{i=1}^n X_i$ (with $S_0 = 0$) and $T_n$ is the
sum $T_n = \sum_{i=1}^n V_i$ (with $T_0 = 0$), then by analogy with the continuous case, one expects $S_n$ to be a
Brownian motion at time $T_n$ (a discretization of the bracket process). The analogy requires, however,
that the increments have some a priori bound. Namely, if $|X_k| \le R$,



P [3 n < t so that S n > a and T n > b] < 2 exp 



a 2 /2 



Remark 27. The constants quoted here are slightly better than the constants that appear in Freed- 
man's original paper. This statement of the theorem follows from Proposition 2.1 of [Frc75; and the 
calculus lemma 

u 2 /2 



(1 +u)log(l +u)-u> 



l + u/3' 



for u > 0. 
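As a sanity check on the calculus lemma of Remark 27, the following minimal sketch evaluates both sides of $(1+u)\log(1+u) - u \ge (u^2/2)/(1+u/3)$ on a grid. The function names and the grid parameters are our own illustrative choices; a grid check is of course not a proof.

```python
import math


def bennett_h(u: float) -> float:
    """Left side of the calculus lemma: (1 + u) log(1 + u) - u."""
    return (1.0 + u) * math.log1p(u) - u


def bernstein_lower(u: float) -> float:
    """Right side of the calculus lemma: (u^2 / 2) / (1 + u / 3)."""
    return (u * u / 2.0) / (1.0 + u / 3.0)


def lemma_holds_on_grid(u_max: float = 50.0, steps: int = 5000) -> bool:
    """Check the inequality at evenly spaced points of (0, u_max]."""
    for i in range(1, steps + 1):
        u = u_max * i / steps
        # A tiny slack absorbs floating-point rounding.
        if bennett_h(u) < bernstein_lower(u) - 1e-12:
            return False
    return True
```

For small $u$ the two sides agree up to third order (both equal $u^2/2 - u^3/6 + O(u^4)$), which is why the lemma costs nothing in the Gaussian regime of Freedman's bound.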



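Reorder and relabel the vertices from $U$ as $x_1, x_2, \ldots, x_r$, with $r \le \lceil n/2 \rceil$ so that $|x_j|$ decreases in $j$. Order pairs $(i,j) \in [d] \times \{0, 1, 2, \ldots, r\}$ lexicographically, and enumerate $\pi_i(j)$ in this order as $f_1, f_2, \ldots, f_{rd}$. Define a filtration of $\sigma$-algebras $\{\mathcal{F}_k\}_{k=1}^{rd}$ by revealing these pieces of information, i.e. $\mathcal{F}_k = \mathcal{F}_{k-1} \vee \sigma(f_k)$. According to this filtration, let

\[
S_k = \mathbb{E}\Bigl[ \sum_{(u,v) \in \mathcal{L}'} \bigl( x_u A_{uv} y_v - \mathbb{E}\, x_u A_{uv} y_v \bigr) \Bigm| \mathcal{F}_k \Bigr]
\]

define a martingale and let $X_k = X_{(i,j)}$ be the associated martingale increments. The desired deviation bound can now be cast in terms of $S_k$ as

\[
\mathbb{P}\Bigl[ \Bigl| \sum_{(u,v) \in \mathcal{L}'} \bigl( x_u A_{uv} y_v - \mathbb{E}\, x_u A_{uv} y_v \bigr) \Bigr| > t \Bigr]
\le \mathbb{P}\bigl[ \exists\, k \le rd \text{ so that } |S_k - S_0| = |S_k| > t \text{ and } T_k \le b \bigr]
\le 2 \exp\Bigl( - \frac{t^2/2}{Rt/3 + b} \Bigr),
\]

provided that $b$ satisfies

\[
\sum_{k=1}^{rd} \mathbb{E}\bigl[ X_k^2 \mid \mathcal{F}_{k-1} \bigr] \le b .
\]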



This reduces the problem to finding suitable $R$ and $b$. The starting point for finding any such bound is simplifying the expression for the martingale increments $X_{(j,k)}$. To this end, let $\pi$ be a fixed permutation of $[n]$, and define $\Pi_k$ to be the collection of all permutations that agree with $\pi$ in the first $k$ entries, i.e.

\[
\Pi_k = \{ \sigma : \sigma(i) = \pi(i),\ i = 1, 2, \ldots, k \}.
\]

Further let $\mathcal{T} : \Pi_{k-1} \to \Pi_k$ be the map which maps a permutation to its nearest neighbor in $\Pi_k$, in the sense of transposition distance, i.e.

\[
\mathcal{T}[\sigma](i) =
\begin{cases}
\pi(k) & i = k, \\
\sigma(k) & i = \sigma^{-1}(\pi(k)), \\
\sigma(i) & \text{else.}
\end{cases}
\]

Note that this map is the identity upon restriction to $\Pi_k$. Let $L_{[u,v]}$ be the characteristic function for $(u,v) \in \mathcal{L}$. In terms of this notation, it is possible to express $X_{(j,k)}$ as

\[
X_{(j,k)} = \frac{1}{|\Pi_{k-1}|} \sum_{\tau \in \Pi_{k-1}} \sum_{u} \Bigl( x_u L_{[u, \mathcal{T}[\tau](u)]}\, y_{\mathcal{T}[\tau](u)} - x_u L_{[u, \tau(u)]}\, y_{\tau(u)} \Bigr),
\]

where $\pi = \sigma_j$, and the contributions of the other $\sigma_i$ all cancel. As $\tau(u) = \mathcal{T}[\tau](u)$ except for when $u = k$ or $u = \tau^{-1}(\pi(k))$, this simplifies to

\[
X_{(j,k)} = \frac{1}{|\Pi_{k-1}|} \sum_{\tau \in \Pi_{k-1}} \Bigl( x_k L_{[k, \pi(k)]}\, y_{\pi(k)} - x_k L_{[k, \tau(k)]}\, y_{\tau(k)}
+ x_{\tau^{-1}(\pi(k))} L_{[\tau^{-1}(\pi(k)),\, \tau(k)]}\, y_{\tau(k)} - x_{\tau^{-1}(\pi(k))} L_{[\tau^{-1}(\pi(k)),\, \pi(k)]}\, y_{\pi(k)} \Bigr).
\]

This can be recast probabilistically. Define two random variables v and u as 

\[
v \sim \mathrm{Unif}\{ [n] \setminus \pi[k] \}, \qquad
u \sim \mathrm{Unif}\{ [n] \setminus [k] \},
\]

(where [n] = {1, 2, . . . , n}) so that 

Terms for which $\pi(k) = \tau(k)$ again cancel, and so we have disregarded these terms from the right hand side. It is also for this reason that the small correction appears in front of $X_k$. From here it is possible to immediately deduce a sufficient a priori bound on $X_k$, as each term in this expectation is at most $\sqrt{d}/n$, so that

\[
|X_k| \le \frac{4\sqrt{d}}{n}.
\]



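The conditional variance $\mathbb{E}[X_k^2 \mid \mathcal{F}_{k-1}]$ is not much more complicated. Effectively, we take $\pi(k)$ to be uniformly distributed over $[n] \setminus \pi[k-1]$ and bound $\mathbb{E}[X_k^2 \mid \mathcal{F}_{k-1}]$ by

\[
\mathbb{E}\bigl[ X_k^2 \mid \mathcal{F}_{k-1} \bigr] \le 4\, \mathbb{E}\Bigl[ x_k^2 (L_{[k,v]} y_v)^2 + x_k^2 (L_{[k,\pi(k)]} y_{\pi(k)})^2 + x_u^2 (L_{[u,\pi(k)]} y_{\pi(k)})^2 + x_u^2 (L_{[u,v]} y_v)^2 \Bigm| \mathcal{F}_{k-1} \Bigr].
\]

As we have ordered the $x_i$, $x_u^2 \le x_k^2$. Further, by bounding all the $L_{[a,b]}$ terms by 1, and using that $v$ is marginally distributed as $\mathrm{Unif}\{[n] \setminus \pi[k-1]\}$, this bound becomes

\[
\mathbb{E}\bigl[ X_k^2 \mid \mathcal{F}_{k-1} \bigr] \le 16\, \mathbb{E}\bigl[ x_k^2 y_v^2 \mid \mathcal{F}_{k-1} \bigr].
\]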

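Upon explicit calculation, we see that

\[
\mathbb{E}\bigl[ y_v^2 \mid \mathcal{F}_{k-1} \bigr] = \frac{1}{n-k} \sum_{v \in [n] \setminus \pi[k-1]} y_v^2 \le \frac{1}{n-k},
\]

where it has been used that $\|y\| \le 1$. Combining the above with (27), we see that

\[
\mathbb{E}\bigl[ X_k^2 \mid \mathcal{F}_{k-1} \bigr] \le \Bigl( \frac{n-k}{n-k+1} \Bigr)^2 \frac{16 x_k^2}{n-k} \le \frac{32}{n}\, x_k^2, \tag{28}
\]

where it has been used that $k \le r \le \lceil n/2 \rceil$. Summing over all martingale increments,

\[
\sum_{k=1}^{rd} \mathbb{E}\bigl[ X_k^2 \mid \mathcal{F}_{k-1} \bigr] \le \sum_{i=1}^{d} \sum_{k=1}^{r} \frac{32 x_k^2}{n} \le \frac{32 d}{n}.
\]

Thus the Freedman martingale bound becomes

\[
\mathbb{P}\Bigl[ \Bigl| \sum_{(u,v) \in \mathcal{L}'} \bigl( x_u A_{uv} y_v - \mathbb{E}\, x_u A_{uv} y_v \bigr) \Bigr| > t\sqrt{d} \Bigr] \le 2 \exp\Bigl( - \frac{n t^2}{64 + 8t/3} \Bigr).
\]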

□ 
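The constants 64 and 8/3 in Theorem 26 can be recovered by direct substitution into Freedman's bound. The sketch below assumes the values read off from the proof above, namely deviation $a = t\sqrt{d}$, increment bound $R = 4\sqrt{d}/n$ and variance budget $b = 32d/n$; the function names are ours and purely illustrative.

```python
import math


def freedman_exponent(a: float, b: float, R: float) -> float:
    """Exponent (a^2 / 2) / (b + R a / 3) in Freedman's inequality."""
    return (a * a / 2.0) / (b + R * a / 3.0)


def theorem26_exponent(n: int, d: int, t: float) -> float:
    """Freedman exponent with the parameters assumed from the proof."""
    a = t * math.sqrt(d)          # deviation level t * sqrt(d)
    R = 4.0 * math.sqrt(d) / n    # a priori bound on |X_k|
    b = 32.0 * d / n              # bound on the summed conditional variances
    return freedman_exponent(a, b, R)


def closed_form_exponent(n: int, t: float) -> float:
    """The exponent n t^2 / (64 + 8 t / 3) stated in Theorem 26."""
    return n * t * t / (64.0 + 8.0 * t / 3.0)
```

Note that $d$ cancels: the exponent depends on $n$ and $t$ only, which is what makes the subsequent union bound over $T$ uniform in the degree.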
Let $\mathcal{L}_{\mathrm{left}}$ be the set of vertices that appear in the first coordinate of some light couple, and choose $U \subset \mathcal{L}_{\mathrm{left}}$ arbitrarily so that $|U| = \lceil |\mathcal{L}_{\mathrm{left}}|/2 \rceil$. It follows then that, if $U_1 := U$, and $U_2 := \mathcal{L}_{\mathrm{left}} \setminus U_1$,



^ XuA uv y v T*)x u A uv y v 



> 



tVd 



< 2P 



max 

i=l,2 



^ XuA uv y v T*jX u A uv y v 

(ti,»)6£n[/iX[n] 



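From this point, it is possible to estimate

\[
\mathbb{P}\Bigl[ \exists\, x, y \in T : \Bigl| \sum_{(u,v) \in \mathcal{L}} x_u M_{uv} y_v \Bigr| > 2(2t+1)\sqrt{d} \Bigr]
\]

by

\[
\mathbb{P}\Bigl[ \exists\, x, y \in T : \Bigl| \sum_{(u,v) \in \mathcal{L} \cap U \times [n]} x_u \bigl[ A_{uv} - \mathbb{E} A_{uv} \bigr] y_v \Bigr| > t\sqrt{d} \Bigr].
\]

Applying the union bound and Theorem 26 we see now that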



3 x,y £ T 



> 



2(2* + l)Vd 



< c 


(2 + 6)y/2eiv 
exp



-nr 



64 + 8i/3 



so that taking e — 2 > 6 > i and i = 27, it is seen that this probability decays exponentially fast, and 
we have proven 

Theorem 28. There are universal constants C and K sufficiently large and c > so that for e — 2 > 
6 > 5 and except for with probability at most 



\[
K \exp(-cn),
\]

there is no pair of vectors $x, y \in T$ having

\[
\Bigl| \sum_{(u,v) \in \mathcal{L}} x_u M_{uv} y_v \Bigr| > C\sqrt{d}.
\]

It is possible to take $C = 110$.

4.2. Controlling the contribution of the heavy couples. 

Lemma 29 (Discrepancy). For any two vertex sets $A$ and $B$, let $e(A,B)$ denote the number of directed edges from $A$ to $B$ that arise in the form $\pi_i(a) = b$ for some $1 \le i \le d$, $a \in A$ and $b \in B$. Let $\mu(A,B) = |A||B|\frac{d}{n}$. For every $m > 0$, there are constants $c_1 \ge e$ and $c_2$ so that for every pair of vertex sets $A$ and $B$, except with probability $n^{-m}$, at least one of the following properties holds:

(1) either $\dfrac{e(A,B)}{\mu(A,B)} \le c_1$,

(2) or $e(A,B) \log \dfrac{e(A,B)}{\mu(A,B)} \le c_2 \, (|A| \vee |B|) \log \dfrac{n}{|A| \vee |B|}$.

It is possible to take $c_1 = e^4$ and $c_2 = 2e^2(6+m)$.

To prove this lemma, we rely on a standard type of large deviation inequality shown below, which 
mirrors the large deviation inequalities available for sums of i.i.d. indicators. 

Lemma 30. For any $k \ge e$,

\[
\mathbb{P}\bigl[ e(A,B) \ge k \mu(A,B) \bigr] \le \exp\bigl( -k [\log k - 2] \mu \bigr).
\]






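Proof. Let $e_\pi(A,B)$ denote the number of elements $u \in A$ so that $\pi(u) \in B$, and write $a = |A|$ and $b = |B|$. It is possible to bound

\[
\mathbb{P}\bigl[ e_\pi(A,B) = t \bigr] \le \frac{[a]_t [b]_t}{t!\, [n]_t},
\]

where we recall that $[a]_t = a(a-1)\cdots(a-t+1)$ is the falling factorial or Pochhammer symbol. Using the fact that $[n]_t \ge e^{-t} n^t$, this may be bounded as

\[
\mathbb{P}\bigl[ e_\pi(A,B) = t \bigr] \le \frac{1}{t!} \Bigl( \frac{abe}{n} \Bigr)^t,
\]

so that the Laplace transform of $e_\pi(A,B)$ can be estimated as

\[
\mathbb{E}\bigl[ \exp(\lambda e_\pi(A,B)) \bigr] \le \sum_{t=0}^{\infty} \frac{1}{t!} \Bigl( \frac{ab\, e^{1+\lambda}}{n} \Bigr)^t = \exp\Bigl( \frac{ab\, e^{1+\lambda}}{n} \Bigr).
\]

Thus by Markov's inequality and the independence of the $d$ permutations, we have

\[
\mathbb{P}\bigl[ e(A,B) \ge k \mu(A,B) \bigr] \le \frac{\mathbb{E} \exp\bigl( \lambda \sum_{i=1}^{d} e_{\sigma_i}(A,B) \bigr)}{\exp(\lambda k \mu)} \le \exp\bigl[ \mu e^{1+\lambda} - k \lambda \mu \bigr],
\]

where $\lambda > 0$ is any positive number and $\mu = \mu(A,B)$. Taking $1 + \lambda = \log k$, valid for $k > e$, it follows that

\[
\mathbb{P}\bigl[ e(A,B) \ge k \mu(A,B) \bigr] \le \exp\bigl[ -k(\log k - 2)\mu \bigr],
\]

for $k > e$. □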
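For a single permutation ($d = 1$) the count $e_\pi(A,B)$ is hypergeometric, since $\pi(A)$ is a uniformly random $|A|$-subset of $[n]$, so both the per-term bound and the tail bound of Lemma 30 can be compared against exact probabilities. The following is a minimal sketch under that observation; the helper names and the sample sizes used in any check are our own choices.

```python
from math import comb, exp, factorial, log


def hit_count_pmf(n: int, a: int, b: int, t: int) -> float:
    """Exact P[e_pi(A, B) = t] for a uniform permutation of [n], |A| = a, |B| = b."""
    if t < 0 or t > min(a, b):
        return 0.0
    # pi(A) is a uniform a-subset of [n]; count how many of its points fall in B.
    return comb(b, t) * comb(n - b, a - t) / comb(n, a)


def per_term_bound(n: int, a: int, b: int, t: int) -> float:
    """The bound (1 / t!) (a b e / n)^t used in the proof."""
    return (a * b * exp(1.0) / n) ** t / factorial(t)


def upper_tail(n: int, a: int, b: int, threshold: float) -> float:
    """Exact P[e_pi(A, B) >= threshold]."""
    return sum(hit_count_pmf(n, a, b, t)
               for t in range(min(a, b) + 1) if t >= threshold)


def lemma30_bound(k: float, mu: float) -> float:
    """Right side of Lemma 30: exp(-k (log k - 2) mu)."""
    return exp(-k * (log(k) - 2.0) * mu)
```

The bound is vacuous (larger than one) for $k < e^2$ and only becomes informative beyond that point, which is consistent with the factor $e^2$ that multiplies $k$ when the lemma is applied in the proof of Lemma 29.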

Armed with Lemma 30 we can proceed with the proof of Lemma 29.

Proof of Lemma 29. If either of $|A|$ or $|B|$ is greater than $\frac{n}{e}$, then $e(A,B) \le (|A| \wedge |B|)d$, so that

\[
\frac{e(A,B)}{\mu(A,B)} \le \frac{nd(|A| \wedge |B|)}{|A||B|d} = \frac{n}{|A| \vee |B|} \le e.
\]

Thus, it suffices to deal with the case that both $|A|$ and $|B|$ are less than $\frac{n}{e}$. In what follows, we will think of $a$ and $b$ as being the sizes $|A|$ and $|B|$, in preparation to use a union bound. Let $k = k(a,b,n)$ be defined as $k = \max\{k^*, \frac{1}{e}\}$, where $k^*$ satisfies

\[
k^* \log k^* = \frac{(6+m)(a \vee b)\, n}{abd} \log \frac{n}{a \vee b}.
\]

When $a \vee b \le \frac{n}{e}$, it follows that

\[
(6+m)(a \vee b) \log \frac{n}{a \vee b} \ge 2a \log \frac{n}{a} + 2b \log \frac{n}{b} + (2+m)(a \vee b) \log \frac{n}{a \vee b},
\]

where we have used the monotonicity of $x \log \frac{n}{x}$ on $[1, \frac{n}{e}]$; thus

\[
(6+m)(a \vee b) \log \frac{n}{a \vee b} \ge a\Bigl(1 + \log \frac{n}{a}\Bigr) + b\Bigl(1 + \log \frac{n}{b}\Bigr) + (2+m) \log n.
\]

Exponentiating,

\[
\exp[k \log k \, \mu] \ge \Bigl( \frac{ne}{a} \Bigr)^a \Bigl( \frac{ne}{b} \Bigr)^b n^{2+m},
\]

if $k > \frac{1}{e}$. It follows that

\[
\mathbb{P}\bigl[ \exists\, A, B \text{ with } |A| = a,\ |B| = b, \text{ so that } e(A,B) \ge e^2 k(a,b) \mu(A,B) \bigr]
\le \binom{n}{a} \binom{n}{b} \exp\bigl( -e^2 k [\log k] \mu \bigr) \le n^{-2-m}.
\]

Moreover, applying this bound to all $a$ and $b$, it follows that

\[
e(A,B) \le e^2 k(|A|, |B|) \mu(A,B),
\]

except with probability smaller than $n^{-m}$. If for two sets $A$ and $B$, $k = \frac{1}{e}$, then

\[
e(A,B) \le e \mu(A,B),
\]

and we are in the first case of the discrepancy property, for $c_1 \ge e$. Otherwise

\[
e(A,B) \log k \le e^2 k \log k \, \mu(A,B) = e^2 (6+m)(a \vee b) \log \frac{n}{a \vee b},
\]

and noting that $k \ge \frac{e(A,B)}{e^2 \mu(A,B)}$, it follows that

\[
\frac{1}{2}\, e(A,B) \log \frac{e(A,B)}{\mu(A,B)} \le e(A,B) \log \frac{e(A,B)}{e^2 \mu(A,B)} \le e^2 (6+m)(a \vee b) \log \frac{n}{a \vee b},
\]

when $\frac{e(A,B)}{\mu(A,B)} \ge e^4$. If this is not the case, then we are again in the first case of the discrepancy property, taking $c_1 \ge e^4$. Taking $c_1 = e^4$, it follows that we may take $c_2 = 2e^2(6+m)$. □

The discrepancy property implies that there are no dense subgraphs, and thus the contribution of 
the heavy couples is not too large. 

Lemma 31. If the discrepancy property holds, with associated constants $c_1$ and $c_2$, then

\[
\sum_{\{u,v\} \in \mathcal{H}} |x_u M_{uv} y_v| \le C\sqrt{d},
\]

for some constant $C$ depending on $c_1$, $c_2$, and $\delta$.

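Proof. The method of proof here is essentially identical to Kahn and Szemerédi or Feige and Ofek (see [FKS89] or [FO05]). We provide a proof of this lemma for completeness as well as to establish the constants involved. We will partition the summands into blocks where each term $x_u$ or $y_v$ has approximately the same magnitude. Thus let $\gamma_i = 2^i \delta$, and put

\[
A_i = \Bigl\{ u : \frac{\gamma_{i-1}}{\sqrt{n}} \le |x_u| < \frac{\gamma_i}{\sqrt{n}} \Bigr\}, \qquad 1 \le i \le \log \Bigl\lceil \frac{\sqrt{n}}{\delta} \Bigr\rceil,
\]

\[
B_i = \Bigl\{ u : \frac{\gamma_{i-1}}{\sqrt{n}} \le |y_u| < \frac{\gamma_i}{\sqrt{n}} \Bigr\}, \qquad 1 \le i \le \log \Bigl\lceil \frac{\sqrt{n}}{\delta} \Bigr\rceil.
\]

Let $\overline{\mathcal{H}}$ denote those pairs $(i,j)$ so that $\gamma_i \gamma_j \ge \sqrt{d}$. The contribution of the absolute sum can, in these terms, be bounded by

\[
\sum_{(u,v) \in \mathcal{H}} |x_u M_{u,v} y_v| \le \sum_{(i,j) \in \overline{\mathcal{H}}} \frac{\gamma_i \gamma_j}{n}\, e(A_i, B_j).
\]

Let $\lambda_{i,j} = \frac{e(A_i, B_j)}{\mu(A_i, B_j)}$ denote the discrepancy, which can be controlled using Lemma 29. In terms of this quantity, the bound becomes

\[
\sum_{(u,v) \in \mathcal{H}} |x_u M_{u,v} y_v| \le \sum_{(i,j) \in \overline{\mathcal{H}}} \frac{\gamma_i \gamma_j}{n}\, \lambda_{i,j}\, |A_i||B_j| \frac{d}{n}.
\]

In this form, the magnitudes of each of the quantities are somewhat opaque. Consider the sum $\sum_i |A_i| \frac{\gamma_i^2}{n}$; it is at most $4\|x\|^2 = 4$. In particular, it is of constant order. Thus let $\alpha_i = |A_i| \frac{\gamma_i^2}{n}$ and $\beta_j = |B_j| \frac{\gamma_j^2}{n}$. This allows the bound to be rewritten as

\[
\sum_{(i,j) \in \overline{\mathcal{H}}} \alpha_i \beta_j \frac{\lambda_{i,j}\, d}{\gamma_i \gamma_j} = \sqrt{d} \sum_{(i,j) \in \overline{\mathcal{H}}} \alpha_i \beta_j \sigma_{i,j}, \qquad \sigma_{i,j} := \frac{\lambda_{i,j} \sqrt{d}}{\gamma_i \gamma_j}.
\]

This exposes the quantity $\sigma_{i,j}$ as having some special importance. In effect, we will show that either for fixed $i$, $\sum_j \beta_j \sigma_{i,j}$ has constant order, or for fixed $j$, $\sum_i \alpha_i \sigma_{i,j}$ has constant order.

In what follows, we will bound the contribution of the summands where $|A_i| \ge |B_j|$. By symmetry, the contribution of the other summands will have the same bound. The heavy couples will now be partitioned into 6 classes $\{\mathcal{H}_\ell\}_{\ell=1}^{6}$ where their contribution is bounded in a different way. Let $\mathcal{H}_\ell \subset \overline{\mathcal{H}}$ be those pairs $(i,j)$ which satisfy the $\ell$th property from the following list but none of the prior properties:

(1) $\sigma_{i,j} \le c_1$.

(2) $\lambda_{i,j} \le c_1$.

(3) $\gamma_j > \frac{1}{4} \sqrt{d}\, \gamma_i$.

(4) $\log \lambda_{i,j} > \frac{1}{4} \bigl[ 2 \log \gamma_i + \log \frac{1}{\alpha_i} \bigr]$.

(5) $2 \log \gamma_i \ge \log \frac{1}{\alpha_i}$.

(6) $2 \log \gamma_i < \log \frac{1}{\alpha_i}$.

The last properties are better understood when the second case of the discrepancy property is expressed in present notation. In its original form, it states

\[
e(A_i, B_j) \log \lambda_{i,j} \le c_2 |A_i| \log \frac{n}{|A_i|}.
\]

Substituting $\gamma_i^2 / \alpha_i$ for $n / |A_i|$ and multiplying both sides of this equation through by $\frac{\gamma_j}{|A_i|\, \gamma_i \sqrt{d}\, \log \lambda_{i,j}}$ produces the equivalent form

\[
\beta_j \sigma_{i,j} \le c_2\, \frac{\gamma_j}{\sqrt{d}\, \gamma_i} \cdot \frac{2 \log \gamma_i + \log \frac{1}{\alpha_i}}{\log \lambda_{i,j}}.
\]

Thus, the last 3 cases cover each of the possible dominant log terms in this bound.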

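4.2.1. Bounding the contribution of $\mathcal{H}_1$ and $\mathcal{H}_2$. In either of these situations, we have a bound on $\sigma_{i,j}$. Specifically, either $\sigma_{i,j} \le c_1$ or all the discrepancies $\lambda_{i,j}$ are uniformly bounded by $c_1$. As

\[
\sigma_{i,j} = \frac{\lambda_{i,j} \sqrt{d}}{\gamma_i \gamma_j} \quad \text{and} \quad \gamma_i \gamma_j \ge \sqrt{d},
\]

we have

\[
\sigma_{i,j} \le c_1
\]

for both cases.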



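4.2.2. Bounding the contribution of $\mathcal{H}_3$. For these terms, we fix $j$. In this case, the magnitudes of the entries corresponding to $j$ of $y_v$ dominate those of the entries corresponding to $i$ of $x_u$. However, by regularity $e(A_i, B_j) \le |B_j| d$, so that the discrepancy $\lambda_{i,j}$ is at most $\frac{n}{|A_i|} = \frac{\gamma_i^2}{\alpha_i}$. Hence

\[
\sum_{i \,:\, (i,j) \in \mathcal{H}_3} \alpha_i \sigma_{i,j}
= \sum_{i \,:\, (i,j) \in \mathcal{H}_3} \alpha_i \frac{\lambda_{i,j} \sqrt{d}}{\gamma_i \gamma_j}
\le \sum_{i \,:\, (i,j) \in \mathcal{H}_3} \frac{\gamma_i \sqrt{d}}{\gamma_j}
\le 8,
\]

where in the last step it has been used that the sum is geometric with leading term less than $4\gamma_j / \sqrt{d}$.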

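4.2.3. Bounding the contribution of $\mathcal{H}_4$. For these terms, we fix $i$. We are not in case (2), and it follows that the second case of the discrepancy property holds. In present notation

\[
\beta_j \sigma_{i,j} \le c_2\, \frac{\gamma_j}{\sqrt{d}\, \gamma_i} \cdot \frac{2 \log \gamma_i + \log \frac{1}{\alpha_i}}{\log \lambda_{i,j}} \le \frac{4 c_2 \gamma_j}{\gamma_i \sqrt{d}},
\]

where the hypothesis has been used. As we are not in case (3), the sum of these terms is bounded as

\[
\sum_{j \,:\, (i,j) \in \mathcal{H}_4} \beta_j \sigma_{i,j} \le 2 c_2,
\]

where it has been used that the sum above has a geometric dominator with leading term at most $c_2$.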

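4.2.4. Bounding the contribution of $\mathcal{H}_5$. For these terms, we fix $i$. Again, the second case of the discrepancy property holds. Now, in addition,

\[
\log \lambda_{i,j} \le \frac{1}{4} \Bigl[ 2 \log \gamma_i + \log \frac{1}{\alpha_i} \Bigr] \le \log \gamma_i,
\]

i.e. that $\lambda_{i,j} \le \gamma_i$. Furthermore, we are not in case (1) so $c_1 < \sigma_{i,j} = \frac{\lambda_{i,j} \sqrt{d}}{\gamma_i \gamma_j} \le \frac{\sqrt{d}}{\gamma_j}$. Thus the second discrepancy bound becomes

\[
\beta_j \sigma_{i,j} \le c_2\, \frac{\gamma_j}{\sqrt{d}\, \gamma_i} \cdot \frac{2 \log \gamma_i + \log \frac{1}{\alpha_i}}{\log \lambda_{i,j}}
\le c_2\, \frac{\gamma_j}{\sqrt{d}\, \gamma_i} \cdot \frac{4 \log \gamma_i}{\log c_1}
\le \frac{4 c_2 \gamma_j}{\sqrt{d}\, c_1},
\]

where it has been used that $\gamma_i \ge \lambda_{i,j} \ge c_1 \ge e$, and that $\log x / x$ is monotonically decreasing for $x \ge e$. Thus,

\[
\sum_{j \,:\, (i,j) \in \mathcal{H}_5} \beta_j \sigma_{i,j} \le \sum_{j \,:\, (i,j) \in \mathcal{H}_5} \frac{4 c_2 \gamma_j}{\sqrt{d}\, c_1} \le \frac{8 c_2}{c_1^2},
\]

where it has been used that the second sum above is geometric with largest term $\sqrt{d} / c_1$.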

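4.2.5. Bounding the contribution of $\mathcal{H}_6$. For these terms, we fix $j$. The second case of the discrepancy property holds and in addition,

\[
\log \lambda_{i,j} \le \frac{1}{4} \Bigl[ 2 \log \gamma_i + \log \frac{1}{\alpha_i} \Bigr] \le \frac{1}{2} \log \frac{1}{\alpha_i}.
\]

This implies that $\alpha_i$ satisfies the asymmetric bound $\alpha_i \le \frac{1}{\lambda_{i,j}^2}$. Thus,

\[
\sum_{i \,:\, (i,j) \in \mathcal{H}_6} \alpha_i \sigma_{i,j} \le \sum_{i \,:\, (i,j) \in \mathcal{H}_6} \frac{\sqrt{d}}{\lambda_{i,j}\, \gamma_i \gamma_j} \le 2,
\]

where it has been used that the sum above is geometric with leading term at most 1 (which follows as $\gamma_i \gamma_j \ge \sqrt{d}$ and $\lambda_{i,j} > c_1 \ge 1$).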

4.2.6. Assembling the bound. We must sum the contributions of each of the classes of couples. Recall that we must double the contribution here because we have only considered couples where $|A_i| \ge |B_j|$. In each of the cases outlined above, it only remains to sum over the $\alpha_i$ or $\beta_j$ in each bound. Doing so contributes a factor of 4 to each bound, so that the constant can be given by

\[
16 c_1 + 32 + 8 c_2 + \frac{32 c_2}{c_1^2} + 8 .
\]

□ 

4.3. Finalizing the proof of Theorem 24.

Proof. We will take $\delta = \frac{1}{2}$. With $m$ given, it follows that the discrepancy property (Lemma 29) holds with probability at least $1 - n^{-m}$, and with constants $c_1 = e^4$ and $c_2 = 2e^2(6+m)$. Therefore, by Lemma 31, for any two $x, y \in T$, the contribution of the heavy couples to $x^t M y$ (which is at most twice the contribution of $x^t A y$, given that the bounds hold for all $x$ and $y$) is at most

\[
\Bigl( 16 c_1 + 32 + 8 c_2 + \frac{32 c_2}{c_1^2} + 8 \Bigr) \sqrt{d} \le (8854 + 585m) \sqrt{d}.
\]

By Theorem 28, with probability at least $1 - C \exp(-cn)$ for some universal constants $C > 0$ and $c > 0$, the contribution of the light couples is never more than $110\sqrt{d}$. Hence

\[
\sup_{x, y \in T} |x^t M y| \le (8964 + 585m) \sqrt{d},
\]

except with probability at most $n^{-m} + C \exp(-cn)$. At last, this implies that $\lambda_2 \vee |\lambda_n| \le 4(8964 + 585m)\sqrt{d}$, except with probability at most $n^{-m} + C\exp(-cn)$. □

5. Linear statistics of eigenvalues 

We now connect Section 3.2 to linear eigenvalue statistics of the adjacency matrix of $G_n$. Let $\{T_n(x)\}_{n \in \mathbb{N}}$ be the Chebyshev polynomials of the first kind on the interval $[-1, 1]$. We define a set of polynomials

\[
\Gamma_0(x) = 1, \tag{29}
\]
\[
\Gamma_{2k}(x) = 2 T_{2k}\Bigl( \frac{x}{2} \Bigr) + \frac{2d-2}{(2d-1)^k}, \quad \forall\, k \ge 1, \tag{30}
\]
\[
\Gamma_{2k+1}(x) = 2 T_{2k+1}\Bigl( \frac{x}{2} \Bigr), \quad \forall\, k \ge 0. \tag{31}
\]

We note that much of the following proposition can be found in Lemma 10.4 of [Fri08].

Proposition 32. Let $A_n$ be the adjacency matrix of $G_n$, and let $\lambda_1 \ge \cdots \ge \lambda_n$ be the eigenvalues of $(2d-1)^{-1/2} A_n$. Then

\[
N_k^{(n)} := \sum_{i=1}^{n} \Gamma_k(\lambda_i) = (2d-1)^{-k/2}\, \mathrm{CNBW}_k^{(n)} .
\]

Proof. To show the above, we will first use the Chebyshev polynomials of the second kind on $[-1, 1]$, namely $\{U_n\}_{n \in \mathbb{N}}$. Let

\[
p_k(x) = U_k\Bigl( \frac{x}{2} \Bigr) - \frac{1}{2d-1}\, U_{k-2}\Bigl( \frac{x}{2} \Bigr). \tag{32}
\]

It is known [ABLS07, eqn. 12] that $(2d-1)^{-k/2}\, \mathrm{NBW}_k^{(n)} = \sum_{i=1}^{n} p_k(\lambda_i)$. We thus proceed by relating $\mathrm{CNBW}_k^{(n)}$ to $\mathrm{NBW}_k^{(n)}$.

A closed non-backtracking walk of length $k$ is either cyclically non-backtracking or can be obtained from a closed non-backtracking walk of length $k-2$ by "adding a tail," i.e., adding a new step to the beginning of the walk and its reverse to the end. For any closed cyclically non-backtracking walk of length $k-2$, we can add a tail in $2d-2$ ways. For any closed non-backtracking walk of length $k-2$ that is not cyclically non-backtracking, we can add a tail in $2d-1$ ways. Hence for $k \ge 3$,

\[
\mathrm{NBW}_k^{(n)} = \mathrm{CNBW}_k^{(n)} + (2d-2)\, \mathrm{CNBW}_{k-2}^{(n)} + (2d-1) \bigl( \mathrm{NBW}_{k-2}^{(n)} - \mathrm{CNBW}_{k-2}^{(n)} \bigr)
= \mathrm{CNBW}_k^{(n)} + (2d-1)\, \mathrm{NBW}_{k-2}^{(n)} - \mathrm{CNBW}_{k-2}^{(n)} .
\]

Applying this relation iteratively and noting that $\mathrm{CNBW}_k^{(n)} = \mathrm{NBW}_k^{(n)}$ for $k = 1, 2$, we have

\[
\mathrm{CNBW}_k^{(n)} = \mathrm{NBW}_k^{(n)} - (2d-2) \bigl( \mathrm{NBW}_{k-2}^{(n)} + \mathrm{NBW}_{k-4}^{(n)} + \cdots + \mathrm{NBW}_a^{(n)} \bigr),
\]

with $a = 2$ if $k$ is even and $a = 1$ if $k$ is odd. Observe now that

\[
\Gamma_{2k}(x) = p_{2k}(x) - (2d-2) \Bigl( \frac{p_{2k-2}(x)}{2d-1} + \frac{p_{2k-4}(x)}{(2d-1)^2} + \cdots + \frac{p_2(x)}{(2d-1)^{k-1}} \Bigr)
\]

and

\[
\Gamma_{2k-1}(x) = p_{2k-1}(x) - (2d-2) \Bigl( \frac{p_{2k-3}(x)}{2d-1} + \frac{p_{2k-5}(x)}{(2d-1)^2} + \cdots + \frac{p_1(x)}{(2d-1)^{k-1}} \Bigr).
\]

A quick calculation shows now that

\[
\Gamma_{2k}(x) = U_{2k}\Bigl( \frac{x}{2} \Bigr) - U_{2k-2}\Bigl( \frac{x}{2} \Bigr) + \frac{2d-2}{(2d-1)^k},
\]

while

\[
\Gamma_{2k+1}(x) = U_{2k+1}\Bigl( \frac{x}{2} \Bigr) - U_{2k-1}\Bigl( \frac{x}{2} \Bigr),
\]

and the rest follows from the fact that $T_k(x) = \frac{1}{2} \bigl( U_k(x) - U_{k-2}(x) \bigr)$. □
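The "quick calculation" in the proof is a telescoping identity between the polynomials $p_k$ of (32) and the $\Gamma_k$ of (29) through (31), and it is easy to confirm numerically. The sketch below uses the three-term recurrences with the convention $U_{-1} = 0$; all function names are ours, and the check is a floating-point comparison rather than a symbolic proof.

```python
def cheb_T(k: int, x: float) -> float:
    """Chebyshev polynomial of the first kind, T_k(x), for k >= 0."""
    prev, cur = 1.0, x
    if k == 0:
        return prev
    for _ in range(k - 1):
        prev, cur = cur, 2.0 * x * cur - prev
    return cur


def cheb_U(k: int, x: float) -> float:
    """Chebyshev polynomial of the second kind, with U_{-1} = 0 and U_0 = 1."""
    if k == -1:
        return 0.0
    prev, cur = 1.0, 2.0 * x
    if k == 0:
        return prev
    for _ in range(k - 1):
        prev, cur = cur, 2.0 * x * cur - prev
    return cur


def p_poly(k: int, x: float, d: int) -> float:
    """p_k(x) = U_k(x/2) - U_{k-2}(x/2) / (2d - 1), for k >= 1."""
    return cheb_U(k, x / 2.0) - cheb_U(k - 2, x / 2.0) / (2 * d - 1)


def gamma_direct(k: int, x: float, d: int) -> float:
    """Gamma_k(x) for k >= 1, as defined in (30) and (31)."""
    value = 2.0 * cheb_T(k, x / 2.0)
    if k % 2 == 0:
        value += (2 * d - 2) / (2 * d - 1) ** (k // 2)
    return value


def gamma_from_p(k: int, x: float, d: int) -> float:
    """Gamma_k(x) assembled from p_k, p_{k-2}, ... as in the proof."""
    total = p_poly(k, x, d)
    i = 1
    while k - 2 * i >= 1:
        total -= (2 * d - 2) * p_poly(k - 2 * i, x, d) / (2 * d - 1) ** i
        i += 1
    return total
```

The only surviving lower-order term is the constant $(2d-2)/(2d-1)^k$ in the even case, which comes from $U_0 = 1$; in the odd case the corresponding term is a multiple of $U_{-1} = 0$.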



The weak convergence of the sequence $(\mathrm{CNBW}_k^{(n)},\ 1 \le k \le r_n)$ in Theorem 21 allows us to establish limiting laws for a general class of linear functions of eigenvalues. First we will make some canonical choices of parameters $\{r_n\}$. Define

\[
r_n = \frac{\beta \log n}{\log(2d-1)}, \quad \text{for some } \beta < 1/2. \tag{33}
\]

Note that $2 r_n \log(2d-1) = 2\beta \log n$, which shows (18), even when $d$ grows with $n$.

We now need another definition. Let $h$ be a function on $\mathbb{R}$ such that

\[
h(r_n) \ge \log(2d-1), \quad \text{for all large enough } n. \tag{34}
\]

This definition is not so important when $d$ is fixed, since the constant function $h(x) = \log(2d-1)$ for all $x \in \mathbb{R}$ is a good choice. However, when $d$ grows with $n$, an appropriate choice needs to be made. For example, when $2d - 1 = (\log n)^{\gamma}$ for some $\gamma > 0$, one may take

\[
h(x) = C \log x, \quad \text{for some large enough positive constant } C. \tag{35}
\]

For our next result, we will use some theorems from Approximation Theory. Recall that every function $f$ on $[-1,1]$ which is square-integrable with respect to the arc-sine law has a series expansion with respect to the Chebyshev polynomials of the first kind. Good references for approximation theory and the Chebyshev polynomials are the book [MH03] and the (yet unpublished) book [Tre11].

Recall the polynomials $\Gamma_k(x)$ as defined in (29); if a function has a series expansion in terms of Chebyshev polynomials of the first kind, $T_k(x)$, on $[-1,1]$, then it has a series expansion in terms of $\Gamma_k(x)$ on $[-2,2]$.

We recall the definition of a Bernstein ellipse of radius p. 

Definition 33. Let $\rho > 1$, and let $\mathcal{E}_B(\rho)$ be the image of the circle of radius $\rho$, centered at the origin, under the map $f(z) = \frac{z + z^{-1}}{2}$. We call $\mathcal{E}_B(\rho)$ the Bernstein ellipse of radius $\rho$. The ellipse has foci at $\pm 1$, and the sum of the major semiaxis and the minor semiaxis is exactly $\rho$.
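The two geometric claims in Definition 33 can be checked by sampling the image of the circle. The following sketch is illustrative only: the function names and the sampling resolution are our own choices, and the semiaxes are read off as the largest real and imaginary parts of the sampled image points.

```python
import cmath
import math


def joukowski(z: complex) -> complex:
    """The map f(z) = (z + 1/z) / 2 of Definition 33."""
    return (z + 1.0 / z) / 2.0


def semi_axes(rho: float, samples: int = 3600) -> tuple:
    """Approximate (major, minor) semiaxes of the Bernstein ellipse of radius rho."""
    points = [joukowski(rho * cmath.exp(2j * math.pi * i / samples))
              for i in range(samples)]
    major = max(abs(p.real) for p in points)
    minor = max(abs(p.imag) for p in points)
    return major, minor
```

In closed form the semiaxes are $(\rho + \rho^{-1})/2$ and $(\rho - \rho^{-1})/2$, whose sum is $\rho$ and whose squares differ by 1, which places the foci at $\pm 1$.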

To prove our main result for d fixed, we first need a lemma. 

Lemma 34. Suppose that $d \ge 2$ is fixed. Let $f$ be a function defined on $\mathbb{C}$ which is analytic inside a Bernstein ellipse of radius $2\rho$, where $\rho = (2d-1)^{\alpha}$, for some $\alpha > 2$, and such that $|f(z)| < M$ inside this ellipse.

Let $f(x) = \sum_{i=0}^{\infty} c_i \Gamma_i(x)$ for $x$ on $[-2,2]$ (the existence, as well as uniform convergence of the series on $[-2,2]$, is guaranteed by the fact that $f$ is analytic on $[-2,2]$). Then the following things are true:

(i) The expansion of $f(x)$ in terms of $\Gamma_i(x)$ actually converges uniformly on $[-2-\epsilon, 2+\epsilon]$ for some small enough $\epsilon > 0$.

(ii) The aforementioned series expansion also converges pointwise on $\bigl[2, \frac{2d}{\sqrt{2d-1}}\bigr]$.

(iii) If $f_k := \sum_{i=0}^{k} c_i \Gamma_i$ is the $k$th truncation of this (modified) Chebyshev series for $f$, then, for a small enough $\epsilon$,

\[
\sup_{0 \le |x| \le 2+\epsilon} |f(x) - f_k(x)| \le M' (2d-1)^{-\alpha' k},
\]

where $2 < \alpha' < \alpha$, and $M'$ is a constant independent of $k$.

(iv) For all k G N, let bk = ( 2 d-i) fc > an< ^ ^ u>k ^ e ^ e sequence of weights described in Theorem 
1211 Then the sequence of coefficients {ck}keN satisfies 

\[
\Bigl( \frac{c_k}{(2d-1)^{k/2}\, \omega_k} \Bigr)_{k \in \mathbb{N}} \in L^2(\omega).
\]

Proof. We will prove the facts (i) through (iv) in succession.

Facts (i) and (ii) will use a particular expression for $T_n(x)$ outside $[-1,1]$, namely,

\[
T_n(x) = \frac{\bigl( x - \sqrt{x^2-1} \bigr)^n + \bigl( x + \sqrt{x^2-1} \bigr)^n}{2}. \tag{36}
\]

For Fact (i), it is easy to see that if $x$ is in $[-2-\epsilon, 2+\epsilon]$, and particularly for $\epsilon$ small enough,

\[
|\Gamma_k(x)| \le C (1 + 3\sqrt{\epsilon})^k,
\]

where $C$ is some constant independent of $k$.

By Theorem 8.1 in [Tre11], which first appeared in Section 61 of [Ber12], it follows that

\[
|c_k| \le M' (2d-1)^{-\alpha k}, \tag{37}
\]

for some constant $M'$ which may depend on $M$ and $d$, but not on $k$.

Note that $1 + 3\sqrt{\epsilon} < (2d-1)^{\alpha}$, for any $d \ge 2$, $\alpha > 2$, and $\epsilon$ small enough.

Consequently, the series $\sum_{k=0}^{\infty} c_k \Gamma_k(x)$ is absolutely convergent on $[-2-\epsilon, 2+\epsilon]$, and hence the expansion of $f$ into this modified Chebyshev series is valid (and absolutely convergent) on $[-2-\epsilon, 2+\epsilon]$. This proves Fact (i).

Similarly, we now look on the interval $\bigl[2, \frac{2d}{\sqrt{2d-1}}\bigr]$, and note that on that interval the expression for $T_n(x/2)$ will be bounded from above by

\[
|T_n(x/2)| \le (2d-1)^{n/2};
\]

indeed, this happens because $x/2 - \sqrt{x^2/4 - 1}$ is decreasing (and maximally 1, at $x = 2$) while $x/2 + \sqrt{x^2/4 - 1}$ is increasing (and maximally $(2d-1)^{1/2}$, at $x = 2d/\sqrt{2d-1}$). From here it follows once again that

\[
|\Gamma_n(x)| \le 2 (2d-1)^{n/2},
\]

on $\bigl[2, \frac{2d}{\sqrt{2d-1}}\bigr]$, and thus the series $\sum_{k=0}^{\infty} c_k \Gamma_k(x)$ is absolutely convergent on this interval as well. The equality with the function $f$ follows from analyticity. This proves Fact (ii).

Fact (iii) is an immediate consequence of (37), by taking $\epsilon$ small enough relative to $d$ and $\alpha$.

Fact (iv) follows easily from the definitions of $\omega_k$, $b_k$ (given in Theorem 21), and from (37). □
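The growth bound used for Fact (ii) rests on formula (36) evaluated at the right endpoint $x = 2d/\sqrt{2d-1}$, where $x/2 + \sqrt{x^2/4 - 1} = \sqrt{2d-1}$ exactly. The sketch below compares (36) with the three-term recurrence at that point and checks the bound $T_n(x/2) \le (2d-1)^{n/2}$; the function names are ours, and the closed form is only used for real arguments of modulus at least 1.

```python
import math


def cheb_T_recurrence(n: int, x: float) -> float:
    """T_n(x) from T_0 = 1, T_1 = x, T_{m+1} = 2 x T_m - T_{m-1}."""
    prev, cur = 1.0, x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, 2.0 * x * cur - prev
    return cur


def cheb_T_closed_form(n: int, x: float) -> float:
    """Formula (36), valid as written for real x with |x| >= 1."""
    root = math.sqrt(x * x - 1.0)
    return ((x - root) ** n + (x + root) ** n) / 2.0


def right_endpoint(d: int) -> float:
    """The point 2d / sqrt(2d - 1), i.e. the top eigenvalue after scaling."""
    return 2.0 * d / math.sqrt(2.0 * d - 1.0)
```

At the endpoint one gets $T_n(x/2) = \bigl( (2d-1)^{n/2} + (2d-1)^{-n/2} \bigr)/2$, so the bound is attained up to a factor between $1/2$ and $1$.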

We can now present our main result for the case when d is fixed. 

Theorem 35. Assume the same conditions on $f$ and notations as in Lemma 34. Then the random variable $\sum_{i=1}^{n} f(\lambda_i) - n c_0$ converges in law to the infinitely divisible random variable

\[
Y_f := \sum_{k=1}^{\infty} \frac{c_k}{(2d-1)^{k/2}}\, \mathrm{CNBW}_k^{(\infty)} .
\]

Remark 36. There is a good explanation of why we must subtract $n c_0$ in the statement of the above theorem. Consider the Kesten-McKay density, normalized to have support $[-2, 2]$:

\[
\rho_{2d}(x) = \frac{2d(2d-1)\sqrt{4 - x^2}}{2\pi \bigl( 4d^2 - (2d-1)x^2 \bigr)}.
\]

It is proved in [McK81] that in the uniform model of random $d$-regular graph, the random variable $n^{-1} \sum_{i=1}^{n} f(\lambda_i)$ converges in probability to $\int_{-2}^{2} f(x) \rho_d(x)\, dx$. This also holds for the present model; one can prove it by applying the contiguity results of [GJKW02], or by using the above theorem to compute that $\lim_{n \to \infty} n^{-1} \sum_{i=1}^{n} \lambda_i^k$ is the $k$th moment of the Kesten-McKay law.

If $\sum_{i=1}^{n} f(\lambda_i)$ converges in distribution (without subtracting the constant), then $n^{-1} \sum_{i=1}^{n} f(\lambda_i)$ converges to zero in probability. Thus such a function $f$ must be orthogonal to one in the $L^2$ space of the Kesten-McKay law. It has been shown in [Sod07, Example 5.3] that the polynomials $(p_k)$, defined in (32), along with the constant polynomial $p_0 = 1$ constitute an orthogonal basis for the $L^2$ space. The polynomials $(\Gamma_k)$, being linear combinations of $(p_k,\ k \ge 1)$, are therefore orthogonal to one in that $L^2$ space. Hence for any $f$ of Theorem 35, the function $f - c_0$ is orthogonal to one in the $L^2$ space of the Kesten-McKay law.
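The orthogonality claim of Remark 36 can be checked by numerical quadrature. The sketch below integrates against the normalized Kesten-McKay density after the substitution $x = 2\cos\theta$, which removes the square-root behaviour at the endpoints; the midpoint rule, the number of steps and all function names are our own illustrative choices.

```python
import math


def km_density(x: float, d: int) -> float:
    """Kesten-McKay density for degree 2d, normalized to support [-2, 2]."""
    return (2 * d * (2 * d - 1) * math.sqrt(4.0 - x * x)
            / (2.0 * math.pi * (4 * d * d - (2 * d - 1) * x * x)))


def integrate_against_km(g, d: int, steps: int = 4000) -> float:
    """Midpoint rule for the integral of g(x) * km_density(x, d) over [-2, 2]."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        x = 2.0 * math.cos(theta)          # substitution x = 2 cos(theta)
        total += g(x) * km_density(x, d) * 2.0 * math.sin(theta) * h
    return total


def cheb_T(k: int, x: float) -> float:
    """Chebyshev polynomial of the first kind via the three-term recurrence."""
    prev, cur = 1.0, x
    if k == 0:
        return prev
    for _ in range(k - 1):
        prev, cur = cur, 2.0 * x * cur - prev
    return cur


def gamma_poly(k: int, x: float, d: int) -> float:
    """Gamma_k(x) for k >= 1, following (30) and (31)."""
    value = 2.0 * cheb_T(k, x / 2.0)
    if k % 2 == 0:
        value += (2 * d - 2) / (2 * d - 1) ** (k // 2)
    return value
```

For instance, with $d = 2$ the second moment of the density is $4/3$, and $\Gamma_2(x) = x^2 - 2 + 2/3$ then integrates to zero, in agreement with the remark.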

Proof. Armed with the results of Lemma 34, the proof is simple. We first claim that

\[
Y_f^{(n)} := \sum_{k=1}^{r_n} c_k N_k^{(n)} = \sum_{k=1}^{r_n} \frac{c_k}{(2d-1)^{k/2}}\, \mathrm{CNBW}_k^{(n)}
\]

converges in law to Yf as n tends to infinity. This follows from Theorem [5T] and Lemma @] once we 
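show that the sequence

\[
\Bigl( \frac{c_k}{(2d-1)^{k/2}\, \omega_k} \Bigr)_{k \in \mathbb{N}} \in L^2(\omega).
\]

This is precisely Fact (iv) from Lemma 34.

The result will now follow from Slutsky's theorem once we show that, for any $\delta > 0$,

\[
\lim_{n \to \infty} \mathbb{P}\Bigl( \Bigl| \sum_{i=1}^{n} f(\lambda_i) - n c_0 - Y_f^{(n)} \Bigr| > \delta \Bigr) = 0. \tag{38}
\]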
The proof of (38) has two components. Choose the parameter $\beta$ in (33) such that $\alpha\beta < 1$. This also implies $\beta < 1/2$. We start by noting

\[
n c_0 + Y_f^{(n)} = \sum_{i=1}^{n} \sum_{k=0}^{r_n} c_k \Gamma_k(\lambda_i) = f_{r_n}(\lambda_1) + \sum_{i=2}^{n} f_{r_n}(\lambda_i).
\]

Recall that the first eigenvalue of $A_n$ is exactly $2d$, irrespective of $n$. Thus, once we scale $A_n$ by $\sqrt{2d-1}$, by Fact (ii) from Lemma 34, $f_{r_n}\bigl( \frac{2d}{\sqrt{2d-1}} \bigr)$ converges as a deterministic sequence to $f\bigl( \frac{2d}{\sqrt{2d-1}} \bigr)$. Choose a large enough $n_1$ such that

\[
\Bigl| f_{r_n}\Bigl( \frac{2d}{\sqrt{2d-1}} \Bigr) - f\Bigl( \frac{2d}{\sqrt{2d-1}} \Bigr) \Bigr| < \delta/4, \quad \text{for all } n \ge n_1.
\]

On the other hand, if we define the event

\[
\mathcal{A}_n := \{ |\lambda_i| \le 2 + \epsilon, \text{ for all } i > 1 \},
\]

Theorem 1.1 in [Fri08] shows that $\mathbb{P}(\mathcal{A}_n) \ge 1 - c n^{-\tau}$, for some positive constants $c$ and $\tau$. On this event, Fact (i) from Lemma 34 together with (33), implies that

\[
\sum_{i=2}^{n} |f(\lambda_i) - f_{r_n}(\lambda_i)| \le (n-1) M \exp\bigl( -\alpha r_n \log(2d-1) \bigr) \le M n \exp(-\alpha\beta \log n) = M n^{-\alpha\beta + 1} = o(1).
\]

Choose a large enough $n_2$ such that the above number is less than $\delta/4$. Thus, for all $n \ge \max(n_1, n_2)$, we have

\[
\mathbb{P}\Bigl( \Bigl| \sum_{i=1}^{n} f(\lambda_i) - n c_0 - Y_f^{(n)} \Bigr| > \delta \Bigr) \le \mathbb{P}(\mathcal{A}_n^c) \le c n^{-\tau} = o(1).
\]

This completes the proof.



□ 



Remark 37. We now take a moment to demonstrate how to compute the limiting distribution of $\sum_{j=1}^{n} \Gamma_k(\lambda_j)$ when $d = 1$ using the results of [BAD11], and we show that it is consistent with our own results. (Though in this paper we focus on $d \ge 2$, our techniques apply for $d = 1$, too, and prove nearly the same result as Theorem 35.) Let $M_n$ be a uniform random $n \times n$ permutation matrix with eigenvalues $e^{2\pi i \varphi_1}, \ldots, e^{2\pi i \varphi_n}$ on the unit circle. Let $A_n = M_n + M_n^T$ with eigenvalues $\lambda_1, \ldots, \lambda_n$, which satisfy $\lambda_j = 2\cos(2\pi \varphi_j)$. We define $f(x) = \Gamma_k(2\cos(2\pi x)) = 2\cos(2\pi k x) + c_k$, where $c_k = 0$ when $k$ is odd and $c_k = (2d-2)/(2d-1)^{k/2}$ when $k$ is even. Then $\sum_{j=1}^{n} \Gamma_k(\lambda_j) = \sum_{j=1}^{n} f(\varphi_j)$.

Theorem 1.1 of [BAD11] gives the characteristic function of the limiting distribution of $\sum_{j=1}^{n} f(\varphi_j) - \mathbb{E} \sum_{j=1}^{n} f(\varphi_j)$ as

\[
\hat{\mu}_f(t) = \exp\Bigl( \int \bigl( e^{itx} - 1 - itx \bigr)\, dM_f(x) \Bigr)
\]

with $M_f$ given by

\[
M_f = \sum_{j=1}^{\infty} \frac{1}{j}\, \delta_{j R_j(f)}, \qquad
R_j(f) = \frac{1}{j} \sum_{h=0}^{j-1} f\Bigl( \frac{h}{j} \Bigr) - \int_0^1 f(x)\, dx.
\]

It is straightforward to calculate that

\[
R_j(f) =
\begin{cases}
2 & \text{if } j \mid k, \\
0 & \text{otherwise.}
\end{cases}
\]

Thus we find

\[
\hat{\mu}_f(t) = \exp\Bigl( \sum_{j \mid k} \frac{1}{j} \bigl( e^{2ijt} - 1 - 2ijt \bigr) \Bigr),
\]

which is the characteristic function of $\mathrm{CNBW}_k^{(\infty)} - \mathbb{E}[\mathrm{CNBW}_k^{(\infty)}]$ for $d = 1$ (note that $a(d,k) = 2$ in this case).
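The computation of $R_j(f)$ in Remark 37 is a statement about Riemann sums of a cosine, and can be verified directly. The sketch below assumes the form $R_j(f) = \frac{1}{j}\sum_{h=0}^{j-1} f(h/j) - \int_0^1 f$ used above, applied to $f(x) = 2\cos(2\pi k x)$ (the case $d = 1$, where the additive constant vanishes); the function name is ours.

```python
import math


def riemann_defect(j: int, k: int) -> float:
    """R_j(f) for f(x) = 2 cos(2 pi k x): Riemann sum on j points minus the integral."""
    riemann_sum = sum(2.0 * math.cos(2.0 * math.pi * k * h / j) for h in range(j)) / j
    integral = 0.0  # the integral of 2 cos(2 pi k x) over [0, 1] vanishes for k >= 1
    return riemann_sum - integral
```

The sum of the $j$th roots of unity raised to the power $k$ equals $j$ when $j$ divides $k$ and vanishes otherwise, which is exactly the dichotomy displayed in the remark.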

Finally, we consider now the case of growing degree $d = d_n$ and the relationship between $d_n$ and $r_n$, as given in the statement of Theorem 22 and in (33). Although we have chosen not to use the notation $d_n$ elsewhere in the paper, we will use it here, to emphasize each pair $(d_n, r_n)$. For our results to be applicable, we will need that both $d_n$ and $r_n$ grow to $\infty$.

We will first remove the dependence on $d_n$ for our orthogonal polynomial basis, making them scaled Chebyshev. Define

\[
\Phi_0(x) = 1, \tag{39}
\]
\[
\Phi_k(x) = 2 T_k\Bigl( \frac{x}{2} \Bigr), \quad k \ge 1. \tag{40}
\]

If $A_n$ is the adjacency matrix of $G_n$ and $\lambda_1 \ge \cdots \ge \lambda_n$ are the eigenvalues of $(2d_n - 1)^{-1/2} A_n$ and $k \ge 1$, then

\[
\sum_{i=1}^{n} \Phi_k(\lambda_i) =
\begin{cases}
(2d_n - 1)^{-k/2} \bigl( \mathrm{CNBW}_k^{(n)} - (2d_n - 2) n \bigr) & \text{if } k \text{ is even,} \\
(2d_n - 1)^{-k/2}\, \mathrm{CNBW}_k^{(n)} & \text{if } k \text{ is odd.}
\end{cases}
\]

Please note from (l23l) that 



Er=i^( A 0-(2d„-l)^ /2 (MfeK)-(2d„-2)n) if A: is even 
lEr=i^(^)-(2d«-l)- fc/2 MfeK) if A: is odd 



Our final result is very similar in spirit to Theorem 35, and we will need a helpful tool like Lemma 34 to make it work.

Lemma 38. Suppose now that $d_n$, $r_n$ are growing with $n$ and governed by (33). Consider the polynomials $\Phi_k$ as in (39). Let $f$ be an entire function on $\mathbb{C}$. Let $a > 1$ be a fixed real number. Then

(i) $f$ admits an absolutely convergent (modified) Chebyshev series expansion

\[
f(x) = \sum_{i=0}^{\infty} c_i \Phi_i(x)
\]

on $[-a, a]$;

(ii) for some choice of weights $\omega = (b_k / k^2 \log k)_{k \in \mathbb{N}}$ from Theorem 22, the sequence of coefficients $(c_k)_{k \in \mathbb{N}}$ satisfies

(41) G L 2 fe). 



Proof. Both Facts (i) and (ii) follow in the same way as the proofs of Facts (i) and (ii) from Lemma 34, noting that, since $f$ is entire, it is sufficient to choose a Bernstein ellipse of large enough radius. This will provide a geometric bound on the coefficients that decays fast enough to compensate for the bounds on the growth of the $T_n(x)$ as given by (43), on the fixed interval $[-a, a]$.

We detail a bit more the proof of Fact (ii), since it is only (slightly) more complex. Choose for 
example bk — jz', since / is entire, choose the Bernstein ellipse of radius 3C, on which / is bounded 
by some given B; as in the proof of Theorem [35j this states that the coefficients c n are bounded by 

(42) \c n \ < B'(3C)- n , 

for some B' independent of n. 
As before, thanks to the expression (36) we can bound the growth of the modified Chebyshev polynomials on $[-C, C]$ by

\[
\max_{x \in [-C, C]} |T_n(x/2)| \le B'' C^n, \tag{43}
\]

for some $B''$ independent of $n$.

With these choices for $\omega$ and $(b_k)_{k \in \mathbb{N}}$, (41) follows now from (42) and (43). □

We can now give our main result for the case when d n and r n both grow. The essential difference 
from before is in the centering and in assumption (ii) below which stresses the dependence on the 
growth rate of the degree sequence. 

Theorem 39. Assume the same setup as in Lemma 38 with the following additional constraints on the entire function $f$:

(i) Let $C := C(1)$ be chosen according to Theorem 24. Let $f_k := \sum_{j=0}^{k} c_j \Phi_j$ denote the $k$th truncation of this series on $[-C, C]$. Then

\[
\sup_{0 \le |x| \le C} |f(x) - f_k(x)| \le M \exp\bigl( -\alpha k\, h(k) \bigr), \quad \text{for some } \alpha > 2 \text{ and } M > 0,
\]

where $h$ has been defined in (34).

(ii) Recall the definition of the sequence $(r_n)$ from (33) with a choice of $\beta < 1/\alpha$. Then $f$ and its sequence of truncations, $f_{r_n}$, satisfy

\[
\lim_{n \to \infty} \Bigl| f_{r_n}\bigl( 2d_n (2d_n - 1)^{-1/2} \bigr) - f\bigl( 2d_n (2d_n - 1)^{-1/2} \bigr) \Bigr| = 0.
\]

Define now the array of constants

\[
m_k^f(n) := \sum_{i=1}^{k} \frac{c_i}{(2d_n - 1)^{i/2}} \bigl( \mu_i(r_n) - \mathbf{1}(i \text{ is even})\, (2d_n - 2)\, n \bigr).
\]

If conditions (i) and (ii) above are satisfied, the sequence of random variables

\[
\Bigl( \sum_{i=1}^{n} f(\lambda_i) - n c_0 - m_{r_n}^f(n) \Bigr)_{n \in \mathbb{N}}
\]

converges in law to a normal random variable with mean zero and variance $\sigma_f^2 = \sum_{k=1}^{\infty} 2k\, c_k^2$.

Remark 40. Note the significance of the term $h(k)$. The presence of $h(k)$, which is usually a logarithmic term as in (35), demands somewhat more than just analyticity of $f$. Similarly, requirement (ii) requires convergence of the sequence of truncations, evaluated at points diverging to $\infty$; it is a kind of "diagonal" convergence, which is not automatically satisfied even for entire functions.

Proof. The proof is almost identical to the proof of Theorem 35, and we only highlight the slight differences. As before, define

nc + YW := £ c k N^ = £ £ c k $ k (\ t ) - m£». 
fe=l »=i \fe=i / 

To prove that r„ (/) converges in law to N(0,a 2 ), we use Fact (ii) from Lemma l38l together with 
assumption (ii); by Theorem l2"2l the convergence follows. We only need to show that 



i=l 



converges to zero in probability. The convergence for $\lambda_1$ is given by assumption (ii), while the rest of it is assured by assumption (i) and Theorem 24. □
6. Appendix

We will compute the exact expression for the number of cyclically reduced words of length $k$ on letters $\pi_1, \ldots, \pi_d, \pi_1^{-1}, \ldots, \pi_d^{-1}$; specifically, we will show

Lemma 41.

\[
a(d, 2k) = (2d-1)^{2k} - 1 + 2d, \qquad a(d, 2k+1) = (2d-1)^{2k+1} + 1.
\]

Proof. This is a quick exercise in inclusion-exclusion. The proof requires some notation, but this should not obscure the simplicity of the ideas. Define

\[
\Pi_k = \{ \pi_1, \pi_2, \ldots, \pi_d, \pi_1^{-1}, \pi_2^{-1}, \ldots, \pi_d^{-1} \}^k
\]

to be all words of length $k$ in these letters. Let $G = \mathbb{Z}/k\mathbb{Z}$ denote the cyclic group of order $k$, and for any subset $S \subseteq G$, define

\[
V_S = \{ w = w_0 \cdots w_{k-1} \in \Pi_k \mid w_s = w_{s+1}^{-1},\ s \in S \},
\]

where the addition is performed in $G$. The essential observation is that

\[
|V_S| =
\begin{cases}
(2d)^{k - |S|} & k > |S|, \\
2d & k = |S|,\ k \text{ even,} \\
0 & k = |S|,\ k \text{ odd.}
\end{cases}
\]

To see the formula for $k > |S|$, note that each $w_i$ with $i \notin S$ can be chosen freely from the alphabet. Moreover, once these are chosen, the word can be completed uniquely by the rules of $V_S$. The $k = |S|$ formulae follow as in these cases, the word must be a single letter that alternates with its inverse, and this is only possible if the length of the word is even.

Having established these formulae, we can compute $a(d,k)$ by inclusion-exclusion,

\[
a(d,k) = \sum_{S \subseteq G} (-1)^{|S|} |V_S| = \sum_{l=0}^{k-1} \binom{k}{l} (-1)^l (2d)^{k-l} + (-1)^k |V_G| .
\]

Noting that this is nearly the binomial formula, the desired expressions follow. □
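For small parameters, Lemma 41 can be confirmed by exhaustive enumeration. In the sketch below a letter $\pi_i$ is encoded as the integer $i$ and its inverse as $-i$ (an encoding of our own choosing), and a word is counted when no letter is cyclically followed by its inverse.

```python
from itertools import product


def count_cyclically_reduced(d: int, k: int) -> int:
    """Brute-force count of cyclically reduced words of length k on 2d letters."""
    letters = [sign * i for i in range(1, d + 1) for sign in (1, -1)]
    count = 0
    for word in product(letters, repeat=k):
        # Position s is compared with position s + 1, taken modulo k.
        if all(word[s] != -word[(s + 1) % k] for s in range(k)):
            count += 1
    return count


def lemma41_formula(d: int, k: int) -> int:
    """Closed form of Lemma 41 for a(d, k)."""
    if k % 2 == 0:
        return (2 * d - 1) ** k - 1 + 2 * d
    return (2 * d - 1) ** k + 1
```

The enumeration visits $(2d)^k$ words, so it is only practical for small $d$ and $k$; for example, $d = 2$ and $k = 3$ gives $64 - 3 \cdot 16 + 3 \cdot 4 = 28 = 3^3 + 1$ words.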